Capstone Project: Facial Emotion Detection¶

Michael Hogge - April 2023

Executive Summary¶

Emotion AI, also known as artificial emotional intelligence, is a subset of artificial intelligence dealing with the detection and replication of human emotion by machines. The successful creation of this "artificial empathy" hinges on a computer's ability to analyze, among other things, human text, speech, and facial expressions. In support of these efforts, this project leverages the power of convolutional neural networks (CNN) to create a computer vision model capable of accurately performing multi-class classification on images containing one of four facial expressions: happy, sad, neutral, and surprise.

Data provided for this project includes over 20,000 grayscale images split into training (75%), validation (24.5%), and test (0.5%) datasets, and further divided into the aforementioned classes. At the outset of the project, a visual analysis of the data is undertaken and a slight imbalance is noted in the class distribution, with 'surprise' images making up a smaller percentage of total images when compared to 'happy,' 'sad,' and 'neutral' images. The unique characteristics of each class are discussed (e.g., images labeled as 'surprise' tend to contain faces with wide open mouths and eyes), including a breakdown of average pixel value by class.

Following the data visualization and analysis phase of the project, nine CNNs are developed, ranging from simple grayscale models to complex transfer learning architectures comprised of hundreds of layers and tens of millions of parameters. Basic models are shown to be lacking the required amount of complexity to properly fit the data, while the transfer learning models (VGG16, ResNet v2, and EfficientNet) are shown to be too complex for the amount and type of data provided for this project. The unsatisfactory performance of the basic and transfer learning models necessitates the development of an alternative model capable of fitting the data and achieving acceptable levels of accuracy while maintaining a high level of generalizability. The proposed model, with four convolutional blocks and 1.8 million parameters, displays high accuracy (75% on training, validation, and test data) when compared to human performance (±65%) on similar data, and avoids overfitting the training data, which can be difficult to achieve with CNNs.

The deployability of this model depends entirely on its intended use. With an accuracy of 75%, deployment in a marketing or gaming setting is perfectly reasonable, assuming consent has been granted, and the handling of highly personal data is done in an ethical, transparent manner with data privacy coming before profit. However, deployment in circumstances where the output from this model could cause serious material damage to an individual (e.g., hiring decisions, law enforcement, evidence in a court of law, etc.) should be avoided. While computer vision models can become quite skilled at classifying human facial expressions (particularly if they are trained on over-emoting/exaggerated images), it is important to note that a connection between those expressions and any underlying emotion is not a hard scientific fact. For example, a smiling person may not always be happy (e.g., they could be uncomfortable or polite), a crying person may not always be sad (e.g., they could be crying tears of joy), and someone who is surprised may be experiencing compound emotions (e.g., happily surprised or sadly surprised).

There is certainly scope to improve the proposed model, including the ethical sourcing of additional, diverse training images, and additional data augmentation on top of what is already performed during the development of the proposed model. In certain scenarios, as indicated above, model deployment could proceed with 75% accuracy, and continued improvement could be pursued by the business/organization/government as time and funding allows. Before model deployment, a set of guiding ethical principles should be developed and adhered to throughout the data collection, analysis, and (possibly) storage phase. Stakeholders must ensure transparency throughout all stages of the computer vision life cycle, while monitoring the overall development of Emotion AI technology and anticipating future regulatory action, which appears likely.

Problem Definition¶

Context:
How do humans communicate with one another? While spoken and written communication may immediately come to mind, research by Dr. Albert Mehrabian has found that over 50% of communication is conveyed through body language, including facial expressions. In face-to-face conversation, body language, it turns out, plays a larger role in how our message is interpreted than both the words we choose, and the tone with which we deliver them. Our expression is a powerful window into our true feelings, and as such, it can be used as a highly-effective proxy for sentiment, particularly in the absence of written or spoken communication.

Emotion AI (artificial emotional intelligence, or affective computing), attempts to leverage this proxy for sentiment by detecting and processing facial expression (through neural networks) in an effort to successfully interpret human emotion and respond appropriately. Developing models that can accurately detect facial emotion is therefore an important driver of advancement in the realm of artificial intelligence and emotionally intelligent machines. The ability to successfully extract sentiment from images and video is also a powerful tool for businesses looking to conjure insights from the troves of unstructured data they have accumulated in recent years, or even to extract second-by-second customer responses to advertisements, store layouts, customer/user experience, etc.

Objective:
The objective of this project is to utilize deep learning techniques, including convolutional neural networks, to create a computer vision model that can accurately detect and interpret facial emotions. This model should be capable of performing multi-class classification on images containing one of four facial expressions: happy, sad, neutral, and surprise.

Key Questions:

  • Do we have the data necessary to develop our models, and is it of good enough quality and quantity?
  • What is the best type of machine learning model to achieve our objective?
  • What do we consider 'success' when it comes to model performance?
  • How do different models compare to one another given this definition of success?
  • What are the most important insights that can be drawn from this project upon its conclusion?
  • What is the final proposed model and is it good enough for deployment?

About the Dataset¶

The data set consists of 3 folders, i.e., 'test', 'train', and 'validation'. Each of these folders has four subfolders:

‘happy’: Images of people who have happy facial expressions.
‘sad’: Images of people with sad or upset facial expressions.
‘surprise’: Images of people who have shocked or surprised facial expressions.
‘neutral’: Images of people showing no prominent emotion in their facial expression at all.

Importing the Libraries¶

In [1]:
import zipfile                        # Used to unzip the data

import numpy as np                    # Mathematical functions, arrays, etc.
import pandas as pd                   # Data manipulation and analysis
import os                             # Misc operating system interfaces
import h5py                           # Read and write h5py files
import random


import matplotlib.pyplot as plt       # A library for data visualization
from matplotlib import image as mpimg # Used to show images from filepath 
import seaborn as sns                 # An advanced library for data visualization
from PIL import Image                 # Image processing
import cv2                            # Image processing

# Importing Deep Learning Libraries, layers, models, optimizers, etc
import tensorflow as tf
from tensorflow.keras.preprocessing.image import load_img, img_to_array, ImageDataGenerator
from tensorflow.keras.layers import Dense, Input, Dropout, SpatialDropout2D, GlobalAveragePooling2D, Flatten, Conv2D, BatchNormalization, Activation, MaxPooling2D, LeakyReLU, GaussianNoise
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam, SGD, RMSprop, Adadelta
from tensorflow.keras import regularizers
from keras.regularizers import l2
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.utils import to_categorical
import tensorflow.keras.applications as ap
from tensorflow.keras.applications.vgg16 import VGG16
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras import backend

# Reproducibility within TensorFlow
import fwr13y.d9m.tensorflow as tf_determinism
tf_determinism.enable_determinism()
tf.config.experimental.enable_op_determinism

from tqdm import tqdm                 # Generates progress bars

# Predictive data analysis tools
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

# To suppress warnings
import warnings
warnings.filterwarnings("ignore")

# Needed to silence tensorflow messages while running locally
from silence_tensorflow import silence_tensorflow
silence_tensorflow()
fwr13y.d9m.tensorflow.enable_determinism (version 0.4.0) has been applied to TensorFlow version 2.9.0
In [2]:
# Fixing the seed for random number generators to ensure reproducibility
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

# Ensuring reproducibility using GPU with TensorFlow
os.environ['TF_DETERMINISTIC_OPS'] = '1'

Loading and Unzipping the Data¶

In [3]:
# Extracting image files from the zip file
with zipfile.ZipFile("Facial_emotion_images.zip", "r") as zip_ref:      
    zip_ref.extractall()
In [4]:
dir_train = "Facial_emotion_images/train/"                     # Path of training data after unzipping
dir_validation = "Facial_emotion_images/validation/"           # Path of validation data after unzipping
dir_test = "Facial_emotion_images/test/"                       # Path of test data after unzipping
img_size = 48                                                  # Defining the size of the image as 48 pixels

Visualizing our Classes¶

In [5]:
# Custom function to display first 35 images from the specified training folder

def display_emotion(emotion):
    train_emotion = dir_train + emotion + "/"
    plt.figure(figsize = (11, 11))
    
    for i in range(1, 36):
        plt.subplot(5, 7, i)
        img = load_img(train_emotion +
                    os.listdir(train_emotion)[i],
                    target_size = (img_size, img_size))
        plt.imshow(img)
        
    plt.show()

Happy¶

In [6]:
print("These are the first 35 training images labeled as 'Happy':")
display_emotion("happy")
These are the first 35 training images labeled as 'Happy':
In [7]:
# An example image pulled from the images above

img_x = os.listdir("Facial_emotion_images/train/happy/")[16]
img_happy_16 = mpimg.imread("Facial_emotion_images/train/happy/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_happy_16, cmap='Greys_r')
plt.show()

Observations and Insights: Happy

  • In most images, the person is smiling. Some smiles are with an open mouth with teeth visible, and some are with closed lips. Our models will need to learn both types of smiles.
  • Image brightness varies considerably and will need to be addressed with data augmentation.
  • The ages of the people vary from very young to old.
  • In some images, people are wearing eyeglasses or hats, eating food, or are partially covering their face with their hands. Some images contain watermarks.
  • Some images are head-on while some are sideways. We will address this via data augmentation (rotating and/or flipping images).
  • Images are cropped differently and this will need to be addressed with data augmentation (zoom/crop).
  • Some images do not contain a face (though not shown in the above 35). There is really nothing to do about this ... some of the test images also do not contain a face. As a result, some predictions by the final model will be based on incorrect data labeling.
  • As highlighted by the image above, some images are of non-human faces. In this case, it appears to be a statue with exaggerated features.
  • Some 'happy' images would clearly not be classified as 'happy' by a human being. This brings into question what accuracy really means. If the model correctly predicts an image labeled 'happy' as 'happy', should that be considered accurate if the person in the image is actually frowning and would be considered by a human being to be sad? In a high-stakes, real world situation we could potentially relabel images that have been incorrectly labeled, but in the context of this project, we have been advised to leave the training labels untouched.

Sad¶

In [8]:
print("These are the first 35 training images labeled as 'Sad':")
display_emotion("sad")
These are the first 35 training images labeled as 'Sad':
In [9]:
# An example image pulled from the images above 

img_x = os.listdir("Facial_emotion_images/train/sad/")[7]
img_sad_7 = mpimg.imread("Facial_emotion_images/train/sad/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_sad_7, cmap='Greys_r')
plt.show()

Observations and Insights: Sad

  • In most images, the person is frowning. In many images, people have their eyes closed or are looking down.
  • Compared to the 'happy' images, people labeled 'sad' seem more likely to have their mouths closed.
  • Similar to the 'happy' images, image brightness varies considerably, as does age. Some images are head-on while others are sideways. Some people are covering their face with their hands. As with 'happy' images, 'sad' images are also cropped differently, while some also have watermarks.
  • Some images do not contain a face (though not shown in the above 35).
  • As highlighted by the image above, some images are labeled 'sad' but would probably not be classified as sad by a human being. The person above appears to be smiling, but an accurate prediction by one of our models would classify the image as 'sad'. This raises the same issue about accuracy mentioned earlier.
  • At first glance, apart from the images that are clearly mislabeled, there appears to be enough difference between the 'happy' and 'sad' characteristics that an effective model should be able to tell them apart relatively easily.

Neutral¶

In [10]:
print("These are the first 35 training images labeled as 'Neutral':")
display_emotion("neutral")
These are the first 35 training images labeled as 'Neutral':
In [11]:
# An example image pulled from the images above 

img_x = os.listdir("Facial_emotion_images/train/neutral/")[26]
img_neutral_26 = mpimg.imread("Facial_emotion_images/train/neutral/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_neutral_26, cmap='Greys_r')
plt.show()

Observations and Insights: Neutral

  • At first glance, this seems to be the most difficult label to accurately predict. While 'happy' and 'sad' images appear different enough that a model should be able to tell the difference, 'neutral' faces are in between 'happy' and 'sad', and consequently similar to both.
  • Similar to the other classes discussed above, differences in brightness, age, zoom, hands covering faces, etc. are apparent in the 'neutral' images as well.
  • As highlighted in the image above, some images are simply mistakes and do not contain any faces at all.
  • These neutral images seem more difficult for a human being to correctly classify. Some people appear to be slightly smiling, while others appear to be slightly frowning. This begs the question, where are the lines between happy/neutral and sad/neutral? Neutral images do appear to be more similar to sad images, so it is possible that our models will confuse the two classes.

Surprise¶

In [12]:
print("These are the first 35 training images labeled as 'Surprise':")
display_emotion("surprise")
These are the first 35 training images labeled as 'Surprise':
In [13]:
# An example image pulled from the images above

img_x = os.listdir("Facial_emotion_images/train/surprise/")[17]
img_surprise_34 = mpimg.imread("Facial_emotion_images/train/surprise/"+img_x)
plt.figure(figsize = (2, 2))
plt.imshow(img_surprise_34, cmap='Greys_r')
plt.show()

Observations and Insights: Surprise

  • The most unique characteristics of the 'surprise' images are open mouths and big, open eyes. These seem like features that a successful model should be able to identify and accurately classify. There is a big difference between 'surprise' images and 'neutral' images, for example. It is possible, however, that the open mouth of a 'happy' smile and the open mouth of a 'surprise' image could be difficult for a model to distinguish between.
  • As with the above classes, brightness, crop, age, etc. vary between images. Hands are often covering faces. Some photos are head-on while others face sideways.
  • The above image is an example with a very light pixel value, as opposed to one of the much darker images. The person in the image has the classic open mouth and wide open eyes. The image also contains a watermark.
  • Some images do not contain a face (though not shown in the above 35).

Overall Insights from Visualization of Classes:

  • All images are in grayscale (black/white) and image size is 48 x 48 pixels. We will need to rescale pixel values by dividing by 255 so pixel values are normalized between 0 and 1. This will allow our models to train faster and help to stabilize gradient descent.
  • Some classes have very different characteristics (happy/sad) while other classes are more similar (sad/neutral) and could be challenging for a model to accurately classify.
  • There is a wide range of differences with respect to image brightness, age of person, zoom/crop of image, orientation of the face, objects/hands covering the face, images not containing any face at all, etc. Data augmentation will need to be taken into consideration, and this will be handled when the Data Loaders are created below.
  • Visualizing the images in this way raises an important question: what do we consider an accurate model prediction for an image that is clearly mislabeled? If a person is smiling but labeled as 'sad', and the model accurately predicts 'sad', is that really 'accurate' since a human being would classify the image as 'happy'? If the test data set is large, it would be difficult to go through each image individually to check accuracy (which would also defeat the purpose of creating a computer vision model in the first place), so we will have to live with 'accurate' model predictions that may not be truly accurate. Pondering questions like this lead one to believe that a model can really only be as good as the data it is trained on.

Checking Distribution of Classes¶

In [14]:
# Getting the count of images in each training folder and saving to variables

train_happy = len(os.listdir(dir_train + "happy/"))
train_sad = len(os.listdir(dir_train + "sad/"))
train_neutral = len(os.listdir(dir_train + "neutral/"))
train_surprised = len(os.listdir(dir_train + "surprise/"))

# Creating a Pandas series called "train_series" and converting to Pandas dataframe called "train_df"
# in order to display the table below. The dataframe will also contribute to bar charts farther below.

train_series = pd.Series({'Happy': train_happy, 'Sad': train_sad, 'Neutral': train_neutral, 
                          'Surprised': train_surprised})
train_df = pd.DataFrame(train_series, columns = ['Total Training Images'])
train_df["Percentage"] = round((train_df["Total Training Images"] / train_df["Total Training Images"].sum())*100, 1)
train_df.index.name='Emotions'

print("The distribution of classes within the training data:")
train_df
The distribution of classes within the training data:
Out[14]:
See Full Dataframe in Mito
Total Training Images Percentage
Emotions
Happy 3976 26.3
Sad 3982 26.4
Neutral 3978 26.3
Surprised 3173 21.0
In [15]:
train_df.sum()
Out[15]:
Total Training Images    15109.0
Percentage                 100.0
dtype: float64

Observations: Training Images

  • There are 15,109 training images in total.
  • Happy, sad, and neutral images make up roughly the same share of total training images (26%), while surprise images make up a smaller share (21%). At this stage it is important to note the relatively small imbalance, though the ratio does not seem skewed enough to warrant future manipulation in terms of weights, etc.
  • The insight made above, that surprise images seem to be some of the most unique in terms of characteristics (big open mouth, big open eyes), may actually help us overcome the relatively minor imbalance. There are fewer surprise images, but they may be easier to classify.
In [16]:
# Getting count of images in each validation folder and saving to variables

val_happy = len(os.listdir(dir_validation + "happy/"))
val_sad = len(os.listdir(dir_validation + "sad/"))
val_neutral = len(os.listdir(dir_validation + "neutral/"))
val_surprised = len(os.listdir(dir_validation + "surprise/"))

# Creating a Pandas series called "val_series" and converting to Pandas dataframe called "val_df"
# in order to display the table below. The dataframe will also contribute to bar charts farther below.

val_series = pd.Series({'Happy': val_happy, 'Sad': val_sad, 'Neutral': val_neutral, 
                        'Surprised': val_surprised})
val_df = pd.DataFrame(val_series, columns = ['Total Validation Images'])
val_df["Percentage"] = round((val_df["Total Validation Images"] / val_df["Total Validation Images"].sum())*100, 1)
val_df.index.name='Emotions'

print("The distribution of classes within the validation data:")
val_df
The distribution of classes within the validation data:
Out[16]:
See Full Dataframe in Mito
Total Validation Images Percentage
Emotions
Happy 1825 36.7
Sad 1139 22.9
Neutral 1216 24.4
Surprised 797 16.0
In [17]:
val_df.sum()
Out[17]:
Total Validation Images    4977.0
Percentage                  100.0
dtype: float64

Observations: Validation Images

  • There are 4,977 validation images in total.
  • The distribution across classes is much more imbalanced. Happy images make up almost 37% of total validation images, while surprise images make up only 16%. As the training images and validation images are already split and provided as is, it is not a simple matter of randomly splitting training data with a train/test split. We are stuck with the imbalance.
  • One solution to address the imbalance could be to cap the other classes at the level of the surprise class, but that would throw away a huge portion of our already small data set.
  • As mentioned above, we can surmise that surprise images are easier to classify because of their unique characteristics, and we will see if that is enough to offset the relatively smaller sample size with which to train and validate.
In [18]:
# Getting count of images in each test folder and saving to variables

test_happy = len(os.listdir(dir_test + "happy/"))
test_sad = len(os.listdir(dir_test + "sad/"))
test_neutral = len(os.listdir(dir_test + "neutral/"))
test_surprised = len(os.listdir(dir_test + "surprise/"))

# Creating a Pandas series called "test_series" and converting to Pandas dataframe called "test_df"
# in order to display the table below. The dataframe will also contribute to bar charts farther below.

test_series = pd.Series({'Happy': test_happy, 'Sad': test_sad, 'Neutral': test_neutral, 
                        'Surprised': test_surprised})
test_df = pd.DataFrame(test_series, columns = ['Total Test Images'])
test_df["Percentage"] = round((test_df["Total Test Images"] / test_df["Total Test Images"].sum())*100, 1)
test_df.index.name='Emotions'

print("The distribution of classes within the validation data:")
test_df
The distribution of classes within the validation data:
Out[18]:
See Full Dataframe in Mito
Total Test Images Percentage
Emotions
Happy 32 25.0
Sad 32 25.0
Neutral 32 25.0
Surprised 32 25.0
In [19]:
test_df.sum()
Out[19]:
Total Test Images    128.0
Percentage           100.0
dtype: float64

Observations: Test Images

  • There are 128 test images in total, evenly divided between all four classes.
  • This even distribution will make interpretation of the final confusion matrix very straightforward.
In [20]:
# Concatenating train_df, val_df, and test_df to create "df_total" in order to create the chart below

df_total = pd.concat([train_df, val_df, test_df], axis=1)
df_total.drop(['Percentage'], axis=1, inplace=True)
df_total = df_total.reset_index()
df_total.rename(columns={"index":"Emotions", "Total Training Images":"Train", 
                   "Total Validation Images":"Validate", "Total Test Images":"Test"}, inplace=True)

# Creating bar chart below, grouped by class (i.e. 'emotion') and broken down into "train", "validate", 
# and "test" data. The x-axis is Emotions and the y-axis is Total Images.

df_total.groupby("Emotions", sort=False).mean().plot(kind='bar', figsize=(10,5), 
                            title="TOTAL TRAINING, VALIDATION and TEST IMAGES", 
                            ylabel="Total Images", rot=0, fontsize=12, width=0.9, colormap="Pastel2", 
                            edgecolor='black')
plt.show()

Observations:

  • Depicted graphically, the distribution of classes is clearly imbalanced, but the imbalance is not overpowering.
  • Perhaps most striking is the tiny proportion of test images compared to training images. Rather than a standard machine learning train/validation/test split of 80/10/10 or 70/20/10, the data as provided for this project is 75% training, 24.5% validation, and just 0.5% test. As the data is provided already split into groups, we will work as intended. The vast majority of data will be used to train and then validate our models, with a tiny proportion used for testing. This should work in our favor, maximizing the amount of data used by our models to train.
In [21]:
# Concatenating train_df, val_df, and test_df to create "df_percent" in order to create the chart below

df_percent = pd.concat([train_df, val_df, test_df], axis=1)
df_percent.drop(['Total Training Images', 'Total Validation Images', 'Total Test Images'], axis=1, inplace=True)
df_percent.columns = ["Train", "Validate", "Test"]

# Creating bar chart below, grouped by class (i.e. 'emotion') and broken down into "train", "validate", 
# and "test" data. The x-axis is Emotions and the y-axis is Percentage of Total Images.

df_percent.groupby("Emotions", sort=False).mean().plot(kind='bar', figsize=(10,5), 
                            title="PERCENTAGE OF TOTAL TRAINING, VALIDATION and TEST IMAGES", 
                            ylabel="Percentage of Total Images", rot=0, fontsize=12, width=0.9, colormap="Pastel2", 
                            edgecolor='black')
plt.show()

Observations:

  • A visual depiction of what was discussed earlier. We can see the percentage breakdown of train/validate/test data across classes.
  • Training data is evenly distributed across happy, sad, and neutral classes, with fewer surprise images.
  • Within the validation data set, happy images clearly make up the largest percent of total images, with surprise images coming in a distant last place.
  • Happy images make up a much larger percentage of the validation data set than they do of the training and test data sets.
  • Surprise images make up a larger percentage of the test data set than they do of the training and validation data sets.
In [22]:
# Obtaining the average pixel value for training images in the class 'Happy'
list_train_happy = []

for i in range(len(os.listdir("Facial_emotion_images/train/happy/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/train/happy/")[i]
    im = Image.open("Facial_emotion_images/train/happy/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_train_happy.append(sum(list_x)/len(pix_val))

train_happy_pixel_avg = round(sum(list_train_happy)/len(list_train_happy), 2)

# Obtaining the average pixel value for validation images in the class 'Happy'
list_val_happy = []

for i in range(len(os.listdir("Facial_emotion_images/validation/happy/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/validation/happy/")[i]
    im = Image.open("Facial_emotion_images/validation/happy/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_val_happy.append(sum(list_x)/len(pix_val))
    
val_happy_pixel_avg = round(sum(list_val_happy)/len(list_val_happy), 2)

# Obtaining the average pixel value for test images in the class 'Happy'
list_test_happy = []

for i in range(len(os.listdir("Facial_emotion_images/test/happy/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/test/happy/")[i]
    im = Image.open("Facial_emotion_images/test/happy/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_test_happy.append(sum(list_x)/len(pix_val))
    
test_happy_pixel_avg = round(sum(list_test_happy)/len(list_test_happy), 2)

# Obtaining the average pixel value for training images in the class 'Sad'
list_train_sad = []

for i in range(len(os.listdir("Facial_emotion_images/train/sad/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/train/sad/")[i]
    im = Image.open("Facial_emotion_images/train/sad/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_train_sad.append(sum(list_x)/len(pix_val))
    
train_sad_pixel_avg = round(sum(list_train_sad)/len(list_train_sad), 2)

# Obtaining the average pixel value for validation images in the class 'Sad'
list_val_sad = []

for i in range(len(os.listdir("Facial_emotion_images/validation/sad/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/validation/sad/")[i]
    im = Image.open("Facial_emotion_images/validation/sad/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_val_sad.append(sum(list_x)/len(pix_val))
    
val_sad_pixel_avg = round(sum(list_val_sad)/len(list_val_sad), 2)

# Obtaining the average pixel value for test images in the class 'Sad'
list_test_sad = []

for i in range(len(os.listdir("Facial_emotion_images/test/sad/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/test/sad/")[i]
    im = Image.open("Facial_emotion_images/test/sad/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_test_sad.append(sum(list_x)/len(pix_val))
    
test_sad_pixel_avg = round(sum(list_test_sad)/len(list_test_sad), 2)

# Obtaining the average pixel value for training images in the class 'Neutral'
list_train_neutral = []

for i in range(len(os.listdir("Facial_emotion_images/train/neutral/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/train/neutral/")[i]
    im = Image.open("Facial_emotion_images/train/neutral/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_train_neutral.append(sum(list_x)/len(pix_val))
    
train_neutral_pixel_avg = round(sum(list_train_neutral)/len(list_train_neutral), 2)

# Obtaining the average pixel value for validation images in the class 'Neutral'
list_val_neutral = []

for i in range(len(os.listdir("Facial_emotion_images/validation/neutral/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/validation/neutral/")[i]
    im = Image.open("Facial_emotion_images/validation/neutral/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_val_neutral.append(sum(list_x)/len(pix_val))
    
val_neutral_pixel_avg = round(sum(list_val_neutral)/len(list_val_neutral), 2)

# Obtaining the average pixel value for test images in the class 'Neutral'
list_test_neutral = []

for i in range(len(os.listdir("Facial_emotion_images/test/neutral/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/test/neutral/")[i]
    im = Image.open("Facial_emotion_images/test/neutral/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_test_neutral.append(sum(list_x)/len(pix_val))
    
test_neutral_pixel_avg = round(sum(list_test_neutral)/len(list_test_neutral), 2)

# Obtaining the average pixel value for training images in the class 'Surprise'
list_train_surprise = []

for i in range(len(os.listdir("Facial_emotion_images/train/surprise/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/train/surprise/")[i]
    im = Image.open("Facial_emotion_images/train/surprise/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_train_surprise.append(sum(list_x)/len(pix_val))
    
train_surprise_pixel_avg = round(sum(list_train_surprise)/len(list_train_surprise), 2)

# Obtaining the average pixel value for validation images in the class 'Surprise'
list_val_surprise = []
for i in range(len(os.listdir("Facial_emotion_images/validation/surprise/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/validation/surprise/")[i]
    im = Image.open("Facial_emotion_images/validation/surprise/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_val_surprise.append(sum(list_x)/len(pix_val))
    
val_surprise_pixel_avg = round(sum(list_val_surprise)/len(list_val_surprise), 2)

# Obtaining the average pixel value for test images in the class 'Surprise'
list_test_surprise = []

for i in range(len(os.listdir("Facial_emotion_images/test/surprise/"))):
    list_x = []
    x = os.listdir("Facial_emotion_images/test/surprise/")[i]
    im = Image.open("Facial_emotion_images/test/surprise/"+x, 'r')
    pix_val = list(im.getdata())
    for j in range(len(pix_val)):
        list_x.append(pix_val[j])
    list_test_surprise.append(sum(list_x)/len(pix_val))
    
test_surprise_pixel_avg = round(sum(list_test_surprise)/len(list_test_surprise), 2)

# creating dictionary containing average pixel values by class
dict_pixel_avg = {
    "Emotion": ["Happy", "Sad", "Neutral", "Surprise"],
    "Train": [train_happy_pixel_avg, train_sad_pixel_avg, train_neutral_pixel_avg, train_surprise_pixel_avg],
    "Validate": [val_happy_pixel_avg, val_sad_pixel_avg, val_neutral_pixel_avg, val_surprise_pixel_avg],
    "Test": [test_happy_pixel_avg, test_sad_pixel_avg, test_neutral_pixel_avg, test_surprise_pixel_avg]}

# converting dictionary to dataframe
df_pixel_avg = pd.DataFrame.from_dict(dict_pixel_avg)
df_pixel_avg
Out[22]:
See Full Dataframe in Mito
Emotion Train Validate Test
0 Happy 128.92 129.27 134.07
1 Sad 121.10 120.25 125.68
2 Neutral 124.09 123.92 127.68
3 Surprise 145.78 148.32 144.59
In [23]:
# plotting pixel averages for training, validation and test images

df_pixel_avg.groupby("Emotion", sort=False).mean().plot(kind='bar', figsize=(10,5), 
                            title="PIXEL AVERAGES FOR TRAINING, VALIDATION and TEST IMAGES", 
                            ylabel="Pixel Averages", rot=0, fontsize=12, width=0.9, colormap="Pastel2", 
                            edgecolor='black')
plt.legend(loc=(1.01, 0.5))
plt.show()

Observations: Pixel Values

  • In grayscale, a value of 255 indicates white while a value of 0 indicates black.
  • Consistent across training, validation, and test data sets, images in the surprise class have a higher average pixel value than images across the other three classes. In other words, surprise images are consistently brighter/lighter than happy, sad, and neutral images. Perhaps this is due to mouths being open more consistently and white teeth being exposed, as well as eyes being open wider and therefore more white being visible.
  • As surprise is the least represented class across training and validation data sets, perhaps this is another unique characteristic that will help differentiate it from the other three classes despite making up a smaller percentage of total images on which to train.
  • Across training, validation, and test data sets, images in the sad class have a lower average pixel value than images across the other three classes. In other words, sad images are consistently darker than happy, neutral, and surprise images.
  • It will be interesting to see if average pixel value can help our models more easily learn the sad and surprise images. The confusion matrix will show us how often sad images and surprise images are confused with one another.
  • Also interesting to note, while the sad and neutral images are the most similar visually (in terms of features), they are also the most similar when it comes to pixel values. Again, a look at the final confusion matrix will show us whether or not the two are more likely to be confused with one another.

Note:
Data pre-processing and augmentation will take place during the creation of data loaders. When ImageDataGenerator objects are instantiated, a range of processes can and will be applied, sometimes to varying degrees, depending on the model being created and trained. Some process/augmentation operations include the following:

  • rotation_range allows us to provide a degree range for random rotations of images. This helps address the issue of faces in the training images being tilted in different directions.
  • height_shift_range allows us to shift the image up and down.
  • width_shift_range allows us to shift the image left and right.
  • brightness_range allows us to address the wide range in pixel values from one image to the next. A number smaller than one makes an image darker, and a number larger than one makes an image lighter.
  • shear_range allows us to shear angle in a counter-clockwise direction.
  • zoom_range allows us to zoom in or out, essentially randomly cropping the images.
  • horizontal_flip allows us to flip the training image so it is a mirror image of itself. An image facing left will now face right, etc.
  • rescale is our opportunity to normalize the input image from a tensor filled with numbers from 0 to 255, down to a tensor of numbers ranging from 0 to 1.

While creating our data sets via flow_from_directory, we have an opportunity to set class_mode to 'categorical', which will essentially one-hot-encode our classes. The classes themselves are then defined as 'happy,' 'sad,' 'neutral,' and 'surprise.' This allows us to set our loss to categorical_crossentropy, which itself is used for multi-class classification where each image (in our case) belongs to a single class.

Creating Data Loaders¶

Creating data loaders that we will use as inputs to our initial neural networks. We will create separate data loaders for color_modes grayscale and RGB so we can compare the results. An image that is grayscale has only 1 channel, with pixel values ranging from 0 to 255, while an RGB image has 3 channels, with each pixel having a value for red, green, and blue. Images that are RGB are therefore more complex for a neural network to process.

In [24]:
batch_size  = 32

# Creating ImageDataGenerator objects for grayscale colormode 
datagen_train_grayscale = ImageDataGenerator(horizontal_flip = True, 
                                             brightness_range = (0.,2.),
                                             rescale = 1./255, 
                                             shear_range = 0.3)

datagen_validation_grayscale = ImageDataGenerator(horizontal_flip = True, 
                                                  brightness_range = (0.,2.),
                                                  rescale = 1./255, 
                                                  shear_range = 0.3)

datagen_test_grayscale = ImageDataGenerator(horizontal_flip = True, 
                                            brightness_range = (0.,2.),
                                            rescale = 1./255, 
                                            shear_range = 0.3)


# Creating ImageDataGenerator objects for RGB colormode
datagen_train_rgb = ImageDataGenerator(horizontal_flip = True, 
                                       brightness_range = (0.,2.),
                                       rescale = 1./255, 
                                       shear_range = 0.3)

datagen_validation_rgb = ImageDataGenerator(horizontal_flip = True, 
                                            brightness_range = (0.,2.),
                                            rescale = 1./255, 
                                            shear_range = 0.3)

datagen_test_rgb = ImageDataGenerator(horizontal_flip = True, 
                                      brightness_range = (0.,2.),
                                      rescale = 1./255, 
                                      shear_range = 0.3)



# Creating train, validation, and test sets for grayscale colormode

print("Grayscale Images")

train_set_grayscale = datagen_train_grayscale.flow_from_directory(dir_train,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = True)

val_set_grayscale = datagen_validation_grayscale.flow_from_directory(dir_validation,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)

test_set_grayscale = datagen_test_grayscale.flow_from_directory(dir_test,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)



# Creating train, validation, and test sets for RGB colormode

print("\nColor Images")

train_set_rgb = datagen_train_rgb.flow_from_directory(dir_train,
                        target_size = (img_size, img_size),
                        color_mode = "rgb",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],  
                        seed = 42,
                        shuffle = True)

val_set_rgb = datagen_validation_rgb.flow_from_directory(dir_validation,
                        target_size = (img_size, img_size),
                        color_mode = "rgb",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)

test_set_rgb = datagen_test_rgb.flow_from_directory(dir_test,
                        target_size = (img_size, img_size),
                        color_mode = "rgb",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)
Grayscale Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.

Color Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.

Note:
Data augmentation performed on the data for these initial models includes horizontal_flip, brightness_range, rescale, and shear_range.

Model Building¶

A Note About Neural Networks:
The best algorithmic tools we have available to us for processing images are neural networks. In particular, convolutional neural networks (CNN) have significant advantages over standard artificial neural networks (ANN).

While image classification utilizing ANNs is possible, there are some drawbacks:

  • Translational Invariance: ANNs are not translation invariant, meaning that the location of objects within the image is learned along with the object itself. If the object is located in different areas of the image, varying from image to image, the ANN will likely produce inconsistent results.
  • Spacial Invariance: ANNs are not spatial invariant, meaning that once the image matrix is converted/flattened into an array, they lose spatial information about the image. In reality, nearby pixels within an image should be more strongly related to one another, but an ANN does not leverage this information.
  • Feature Extraction: ANNs give similar importance to each pixel within an image, meaning that they are learning the background of the image to the same degree that they are learning the object within the image. If the background changes from image to image, the ANN will have a difficult time learning that the object itself is the same despite what is going on in the background of the image.
  • Computational Expense: ANNs need input images to be flattened into an array of pixel values, and as the input images get larger and the number of hidden layers increases, the total number of trainable parameters balloons considerably.

On the other hand, through the use of convolutional and pooling layers, CNNs are translationally and spatially invariant. They are able to understand that the location of an object within an image is not important, nor is the background of the image itself. CNNs, through the use of their convolutional layers, are also better able to extract important features of an object within an image. Finally, CNNs take advantage of weight sharing, as the same filters are applied to each area of the image. This reduces the number of weights that need to be learned through backpropagation, thereby minimizing the number of trainable parameters and reducing computational expense.

Taking all of this into account, we will proceed with the development of CNN models to pursue our objectives.

Model 1.1: Base Neural Network (Grayscale)¶

Note:
We will begin by building a simple CNN model to serve as a baseline for future models. The same model will be built with color_mode set to grayscale (with an input shape of 48,48,1) as well as color_mode set to RGB (with an input shape of 48,48,3). The models will then be compared to determine if one approach outperforms the other.

A baseline grayscale model is developed first. It consists of three convolutional blocks with relu activation, MaxPooling and a Dropout layer, followed by a single dense layer with 512 neurons, and a softmax classifier for multi-class classification. Total trainable parameters are 605,060.

In [25]:
# Creating a Sequential model
model_1_grayscale = Sequential()

# Convolutional Block #1
model_1_grayscale.add(Conv2D(64, (2, 2), input_shape = (48, 48, 1), activation='relu', padding = 'same'))
model_1_grayscale.add(MaxPooling2D(2, 2))
model_1_grayscale.add(Dropout(0.2))

# Convolutional Block #2
model_1_grayscale.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_grayscale.add(MaxPooling2D(2, 2))
model_1_grayscale.add(Dropout(0.2))

# Convolutional Block #3
model_1_grayscale.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_grayscale.add(MaxPooling2D(2, 2))
model_1_grayscale.add(Dropout(0.2))

# Flatten layer
model_1_grayscale.add(Flatten())

# Dense layer
model_1_grayscale.add(Dense(512, activation = 'relu'))

# Classifier
model_1_grayscale.add(Dense(4, activation = 'softmax'))

model_1_grayscale.summary()
Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 48, 48, 64)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 24, 24, 64)       0         
 )                                                               
                                                                 
 dropout (Dropout)           (None, 24, 24, 64)        0         
                                                                 
 conv2d_1 (Conv2D)           (None, 24, 24, 32)        8224      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 12, 12, 32)       0         
 2D)                                                             
                                                                 
 dropout_1 (Dropout)         (None, 12, 12, 32)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 12, 12, 32)        4128      
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 6, 6, 32)         0         
 2D)                                                             
                                                                 
 dropout_2 (Dropout)         (None, 6, 6, 32)          0         
                                                                 
 flatten (Flatten)           (None, 1152)              0         
                                                                 
 dense (Dense)               (None, 512)               590336    
                                                                 
 dense_1 (Dense)             (None, 4)                 2052      
                                                                 
=================================================================
Total params: 605,060
Trainable params: 605,060
Non-trainable params: 0
_________________________________________________________________

Compiling and Training the Model¶

In [26]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_1_grayscale.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 5,
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [27]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_1_grayscale.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [28]:
# Fitting model with epochs set to 100
history_1_grayscale = model_1_grayscale.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
472/473 [============================>.] - ETA: 0s - loss: 1.3553 - accuracy: 0.3082
Epoch 1: val_accuracy improved from -inf to 0.40446, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 24s 49ms/step - loss: 1.3552 - accuracy: 0.3080 - val_loss: 1.2486 - val_accuracy: 0.4045 - lr: 0.0010
Epoch 2/100
472/473 [============================>.] - ETA: 0s - loss: 1.1951 - accuracy: 0.4636
Epoch 2: val_accuracy improved from 0.40446 to 0.53546, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 19s 41ms/step - loss: 1.1945 - accuracy: 0.4639 - val_loss: 1.0981 - val_accuracy: 0.5355 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 1.1111 - accuracy: 0.5118
Epoch 3: val_accuracy did not improve from 0.53546
473/473 [==============================] - 20s 43ms/step - loss: 1.1111 - accuracy: 0.5118 - val_loss: 1.0725 - val_accuracy: 0.5347 - lr: 0.0010
Epoch 4/100
472/473 [============================>.] - ETA: 0s - loss: 1.0661 - accuracy: 0.5354
Epoch 4: val_accuracy improved from 0.53546 to 0.57645, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 43ms/step - loss: 1.0658 - accuracy: 0.5358 - val_loss: 1.0028 - val_accuracy: 0.5765 - lr: 0.0010
Epoch 5/100
472/473 [============================>.] - ETA: 0s - loss: 1.0311 - accuracy: 0.5511
Epoch 5: val_accuracy improved from 0.57645 to 0.59755, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 41ms/step - loss: 1.0311 - accuracy: 0.5512 - val_loss: 0.9697 - val_accuracy: 0.5975 - lr: 0.0010
Epoch 6/100
472/473 [============================>.] - ETA: 0s - loss: 0.9950 - accuracy: 0.5684
Epoch 6: val_accuracy improved from 0.59755 to 0.60076, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 41ms/step - loss: 0.9951 - accuracy: 0.5685 - val_loss: 0.9454 - val_accuracy: 0.6008 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 0.9724 - accuracy: 0.5772
Epoch 7: val_accuracy did not improve from 0.60076
473/473 [==============================] - 20s 42ms/step - loss: 0.9724 - accuracy: 0.5772 - val_loss: 0.9970 - val_accuracy: 0.5859 - lr: 0.0010
Epoch 8/100
473/473 [==============================] - ETA: 0s - loss: 0.9494 - accuracy: 0.5920
Epoch 8: val_accuracy improved from 0.60076 to 0.62166, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 41ms/step - loss: 0.9494 - accuracy: 0.5920 - val_loss: 0.9020 - val_accuracy: 0.6217 - lr: 0.0010
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 0.9273 - accuracy: 0.5985
Epoch 9: val_accuracy improved from 0.62166 to 0.63030, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 21s 44ms/step - loss: 0.9273 - accuracy: 0.5985 - val_loss: 0.8958 - val_accuracy: 0.6303 - lr: 0.0010
Epoch 10/100
473/473 [==============================] - ETA: 0s - loss: 0.9179 - accuracy: 0.6088
Epoch 10: val_accuracy did not improve from 0.63030
473/473 [==============================] - 19s 40ms/step - loss: 0.9179 - accuracy: 0.6088 - val_loss: 0.9093 - val_accuracy: 0.6162 - lr: 0.0010
Epoch 11/100
472/473 [============================>.] - ETA: 0s - loss: 0.8932 - accuracy: 0.6189
Epoch 11: val_accuracy improved from 0.63030 to 0.63814, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 19s 41ms/step - loss: 0.8934 - accuracy: 0.6188 - val_loss: 0.8742 - val_accuracy: 0.6381 - lr: 0.0010
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 0.8795 - accuracy: 0.6289
Epoch 12: val_accuracy improved from 0.63814 to 0.64517, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 42ms/step - loss: 0.8795 - accuracy: 0.6289 - val_loss: 0.8668 - val_accuracy: 0.6452 - lr: 0.0010
Epoch 13/100
472/473 [============================>.] - ETA: 0s - loss: 0.8640 - accuracy: 0.6320
Epoch 13: val_accuracy did not improve from 0.64517
473/473 [==============================] - 20s 41ms/step - loss: 0.8645 - accuracy: 0.6319 - val_loss: 0.8784 - val_accuracy: 0.6319 - lr: 0.0010
Epoch 14/100
472/473 [============================>.] - ETA: 0s - loss: 0.8565 - accuracy: 0.6369
Epoch 14: val_accuracy did not improve from 0.64517
473/473 [==============================] - 19s 41ms/step - loss: 0.8562 - accuracy: 0.6369 - val_loss: 0.8624 - val_accuracy: 0.6448 - lr: 0.0010
Epoch 15/100
472/473 [============================>.] - ETA: 0s - loss: 0.8432 - accuracy: 0.6456
Epoch 15: val_accuracy did not improve from 0.64517
473/473 [==============================] - 20s 43ms/step - loss: 0.8432 - accuracy: 0.6454 - val_loss: 0.8713 - val_accuracy: 0.6416 - lr: 0.0010
Epoch 16/100
472/473 [============================>.] - ETA: 0s - loss: 0.8317 - accuracy: 0.6504
Epoch 16: val_accuracy did not improve from 0.64517
473/473 [==============================] - 19s 41ms/step - loss: 0.8318 - accuracy: 0.6502 - val_loss: 0.8629 - val_accuracy: 0.6418 - lr: 0.0010
Epoch 17/100
472/473 [============================>.] - ETA: 0s - loss: 0.8108 - accuracy: 0.6595
Epoch 17: val_accuracy improved from 0.64517 to 0.66044, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 41ms/step - loss: 0.8105 - accuracy: 0.6597 - val_loss: 0.8239 - val_accuracy: 0.6604 - lr: 0.0010
Epoch 18/100
473/473 [==============================] - ETA: 0s - loss: 0.8060 - accuracy: 0.6621
Epoch 18: val_accuracy improved from 0.66044 to 0.66486, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 41ms/step - loss: 0.8060 - accuracy: 0.6621 - val_loss: 0.8214 - val_accuracy: 0.6649 - lr: 0.0010
Epoch 19/100
473/473 [==============================] - ETA: 0s - loss: 0.7970 - accuracy: 0.6670
Epoch 19: val_accuracy improved from 0.66486 to 0.67008, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 41ms/step - loss: 0.7970 - accuracy: 0.6670 - val_loss: 0.8127 - val_accuracy: 0.6701 - lr: 0.0010
Epoch 20/100
473/473 [==============================] - ETA: 0s - loss: 0.7853 - accuracy: 0.6721
Epoch 20: val_accuracy did not improve from 0.67008
473/473 [==============================] - 19s 40ms/step - loss: 0.7853 - accuracy: 0.6721 - val_loss: 0.8171 - val_accuracy: 0.6604 - lr: 0.0010
Epoch 21/100
472/473 [============================>.] - ETA: 0s - loss: 0.7768 - accuracy: 0.6808
Epoch 21: val_accuracy did not improve from 0.67008
473/473 [==============================] - 20s 42ms/step - loss: 0.7769 - accuracy: 0.6805 - val_loss: 0.8197 - val_accuracy: 0.6667 - lr: 0.0010
Epoch 22/100
473/473 [==============================] - ETA: 0s - loss: 0.7725 - accuracy: 0.6793
Epoch 22: val_accuracy did not improve from 0.67008

Epoch 22: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 19s 41ms/step - loss: 0.7725 - accuracy: 0.6793 - val_loss: 0.8669 - val_accuracy: 0.6436 - lr: 0.0010
Epoch 23/100
472/473 [============================>.] - ETA: 0s - loss: 0.7192 - accuracy: 0.7019
Epoch 23: val_accuracy improved from 0.67008 to 0.67852, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 20s 43ms/step - loss: 0.7189 - accuracy: 0.7020 - val_loss: 0.7922 - val_accuracy: 0.6785 - lr: 2.0000e-04
Epoch 24/100
473/473 [==============================] - ETA: 0s - loss: 0.7088 - accuracy: 0.7083
Epoch 24: val_accuracy improved from 0.67852 to 0.68254, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 19s 41ms/step - loss: 0.7088 - accuracy: 0.7083 - val_loss: 0.7970 - val_accuracy: 0.6825 - lr: 2.0000e-04
Epoch 25/100
472/473 [============================>.] - ETA: 0s - loss: 0.6970 - accuracy: 0.7094
Epoch 25: val_accuracy did not improve from 0.68254
473/473 [==============================] - 19s 41ms/step - loss: 0.6966 - accuracy: 0.7097 - val_loss: 0.8051 - val_accuracy: 0.6753 - lr: 2.0000e-04
Epoch 26/100
473/473 [==============================] - ETA: 0s - loss: 0.6992 - accuracy: 0.7145
Epoch 26: val_accuracy did not improve from 0.68254

Epoch 26: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 19s 41ms/step - loss: 0.6992 - accuracy: 0.7145 - val_loss: 0.7965 - val_accuracy: 0.6767 - lr: 2.0000e-04
Epoch 27/100
473/473 [==============================] - ETA: 0s - loss: 0.6865 - accuracy: 0.7171
Epoch 27: val_accuracy did not improve from 0.68254
473/473 [==============================] - 19s 41ms/step - loss: 0.6865 - accuracy: 0.7171 - val_loss: 0.7953 - val_accuracy: 0.6755 - lr: 4.0000e-05
Epoch 28/100
472/473 [============================>.] - ETA: 0s - loss: 0.6791 - accuracy: 0.7192
Epoch 28: val_accuracy did not improve from 0.68254
473/473 [==============================] - 19s 41ms/step - loss: 0.6788 - accuracy: 0.7192 - val_loss: 0.7843 - val_accuracy: 0.6819 - lr: 4.0000e-05
Epoch 29/100
473/473 [==============================] - ETA: 0s - loss: 0.6823 - accuracy: 0.7192
Epoch 29: val_accuracy improved from 0.68254 to 0.68354, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 18s 39ms/step - loss: 0.6823 - accuracy: 0.7192 - val_loss: 0.7811 - val_accuracy: 0.6835 - lr: 4.0000e-05
Epoch 30/100
472/473 [============================>.] - ETA: 0s - loss: 0.6826 - accuracy: 0.7222
Epoch 30: val_accuracy did not improve from 0.68354
473/473 [==============================] - 18s 39ms/step - loss: 0.6825 - accuracy: 0.7222 - val_loss: 0.7958 - val_accuracy: 0.6791 - lr: 4.0000e-05
Epoch 31/100
473/473 [==============================] - ETA: 0s - loss: 0.6689 - accuracy: 0.7239
Epoch 31: val_accuracy improved from 0.68354 to 0.68475, saving model to ./model_1_grayscale.h5
473/473 [==============================] - 22s 47ms/step - loss: 0.6689 - accuracy: 0.7239 - val_loss: 0.7872 - val_accuracy: 0.6847 - lr: 4.0000e-05
Epoch 32/100
472/473 [============================>.] - ETA: 0s - loss: 0.6743 - accuracy: 0.7239
Epoch 32: val_accuracy did not improve from 0.68475

Epoch 32: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
473/473 [==============================] - 21s 44ms/step - loss: 0.6742 - accuracy: 0.7239 - val_loss: 0.7912 - val_accuracy: 0.6811 - lr: 4.0000e-05
Epoch 33/100
473/473 [==============================] - ETA: 0s - loss: 0.6665 - accuracy: 0.7237
Epoch 33: val_accuracy did not improve from 0.68475
473/473 [==============================] - 19s 40ms/step - loss: 0.6665 - accuracy: 0.7237 - val_loss: 0.7818 - val_accuracy: 0.6827 - lr: 8.0000e-06
Epoch 34/100
472/473 [============================>.] - ETA: 0s - loss: 0.6728 - accuracy: 0.7251
Epoch 34: val_accuracy improved from 0.68475 to 0.68676, saving model to ./model_1_grayscale.h5
Restoring model weights from the end of the best epoch: 29.
473/473 [==============================] - 20s 43ms/step - loss: 0.6730 - accuracy: 0.7251 - val_loss: 0.7828 - val_accuracy: 0.6868 - lr: 8.0000e-06
Epoch 34: early stopping
In [29]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['accuracy'])
plt.plot(history_1_grayscale.history['val_accuracy'])
plt.title('Accuracy - Model 1 (Grayscale)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [30]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['loss'])
plt.plot(history_1_grayscale.history['val_loss'])
plt.title('Loss - Model 1 (Grayscale)')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the Model on the Test Set¶

In [31]:
# Evaluating the model's performance on the test set
accuracy = model_1_grayscale.evaluate(test_set_grayscale)
4/4 [==============================] - 0s 21ms/step - loss: 0.8219 - accuracy: 0.6484

Observations and Insights:
As constructed, our baseline grayscale model performs decently. After 29 epochs (best epoch), training accuracy stands at 0.72 and validation accuracy is 0.68. Training accuracy and loss continue to improve, while validation accuracy and loss begin to level off before early-stopping ends the training process. Accuracy on the test set is 0.65. A glance at the results, and the accuracy/loss graphs above, reveals a model that is overfitting and consequently has some room for improvement.

Training Validation Test
Grayscale Accuracy 0.72 0.68 0.65


Model 1.2: Base Neural Network (RGB)¶

Note:
This baseline model will contain the same architecture as the above grayscale model. Due to the input shape changing from 48,48,1 (grayscale) to 48,48,3 (rgb), the total trainable parameters have increased to 605,572.

In [32]:
# Creating a Sequential model
model_1_rgb = Sequential()

# Convolutional Block #1
model_1_rgb.add(Conv2D(64, (2, 2), input_shape = (48, 48, 3), activation='relu', padding = 'same'))
model_1_rgb.add(MaxPooling2D(2, 2))
model_1_rgb.add(Dropout(0.2))

# Convolutional Block #2
model_1_rgb.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_rgb.add(MaxPooling2D(2, 2))
model_1_rgb.add(Dropout(0.2))

# Convolutional Block #3
model_1_rgb.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_1_rgb.add(MaxPooling2D(2, 2))
model_1_rgb.add(Dropout(0.2))

# Flatten layer
model_1_rgb.add(Flatten())

# Dense layer
model_1_rgb.add(Dense(512, activation = 'relu'))

# Classifier
model_1_rgb.add(Dense(4, activation = 'softmax'))

model_1_rgb.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_3 (Conv2D)           (None, 48, 48, 64)        832       
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 24, 24, 64)       0         
 2D)                                                             
                                                                 
 dropout_3 (Dropout)         (None, 24, 24, 64)        0         
                                                                 
 conv2d_4 (Conv2D)           (None, 24, 24, 32)        8224      
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 12, 12, 32)       0         
 2D)                                                             
                                                                 
 dropout_4 (Dropout)         (None, 12, 12, 32)        0         
                                                                 
 conv2d_5 (Conv2D)           (None, 12, 12, 32)        4128      
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 6, 6, 32)         0         
 2D)                                                             
                                                                 
 dropout_5 (Dropout)         (None, 6, 6, 32)          0         
                                                                 
 flatten_1 (Flatten)         (None, 1152)              0         
                                                                 
 dense_2 (Dense)             (None, 512)               590336    
                                                                 
 dense_3 (Dense)             (None, 4)                 2052      
                                                                 
=================================================================
Total params: 605,572
Trainable params: 605,572
Non-trainable params: 0
_________________________________________________________________

Compiling and Training the Model¶

In [33]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_1_rgb.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 5,
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [34]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_1_rgb.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [35]:
# Fitting model with epochs set to 100
history_1_rgb = model_1_rgb.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
472/473 [============================>.] - ETA: 0s - loss: 1.3419 - accuracy: 0.3287
Epoch 1: val_accuracy improved from -inf to 0.42958, saving model to ./model_1_rgb.h5
473/473 [==============================] - 31s 64ms/step - loss: 1.3417 - accuracy: 0.3291 - val_loss: 1.2416 - val_accuracy: 0.4296 - lr: 0.0010
Epoch 2/100
473/473 [==============================] - ETA: 0s - loss: 1.1906 - accuracy: 0.4685
Epoch 2: val_accuracy improved from 0.42958 to 0.53486, saving model to ./model_1_rgb.h5
473/473 [==============================] - 23s 49ms/step - loss: 1.1906 - accuracy: 0.4685 - val_loss: 1.1061 - val_accuracy: 0.5349 - lr: 0.0010
Epoch 3/100
472/473 [============================>.] - ETA: 0s - loss: 1.0946 - accuracy: 0.5112
Epoch 3: val_accuracy improved from 0.53486 to 0.56399, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 50ms/step - loss: 1.0946 - accuracy: 0.5110 - val_loss: 1.0196 - val_accuracy: 0.5640 - lr: 0.0010
Epoch 4/100
473/473 [==============================] - ETA: 0s - loss: 1.0434 - accuracy: 0.5458
Epoch 4: val_accuracy improved from 0.56399 to 0.59494, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 50ms/step - loss: 1.0434 - accuracy: 0.5458 - val_loss: 0.9597 - val_accuracy: 0.5949 - lr: 0.0010
Epoch 5/100
472/473 [============================>.] - ETA: 0s - loss: 0.9967 - accuracy: 0.5700
Epoch 5: val_accuracy improved from 0.59494 to 0.60981, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 52ms/step - loss: 0.9967 - accuracy: 0.5701 - val_loss: 0.9334 - val_accuracy: 0.6098 - lr: 0.0010
Epoch 6/100
472/473 [============================>.] - ETA: 0s - loss: 0.9653 - accuracy: 0.5859
Epoch 6: val_accuracy improved from 0.60981 to 0.62186, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.9653 - accuracy: 0.5859 - val_loss: 0.9016 - val_accuracy: 0.6219 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 0.9419 - accuracy: 0.6026
Epoch 7: val_accuracy improved from 0.62186 to 0.63050, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.9419 - accuracy: 0.6026 - val_loss: 0.8797 - val_accuracy: 0.6305 - lr: 0.0010
Epoch 8/100
472/473 [============================>.] - ETA: 0s - loss: 0.9108 - accuracy: 0.6107
Epoch 8: val_accuracy did not improve from 0.63050
473/473 [==============================] - 24s 51ms/step - loss: 0.9107 - accuracy: 0.6106 - val_loss: 0.8758 - val_accuracy: 0.6299 - lr: 0.0010
Epoch 9/100
472/473 [============================>.] - ETA: 0s - loss: 0.8963 - accuracy: 0.6189
Epoch 9: val_accuracy improved from 0.63050 to 0.63954, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 50ms/step - loss: 0.8962 - accuracy: 0.6189 - val_loss: 0.8507 - val_accuracy: 0.6395 - lr: 0.0010
Epoch 10/100
472/473 [============================>.] - ETA: 0s - loss: 0.8727 - accuracy: 0.6261
Epoch 10: val_accuracy improved from 0.63954 to 0.64014, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.8729 - accuracy: 0.6260 - val_loss: 0.8610 - val_accuracy: 0.6401 - lr: 0.0010
Epoch 11/100
473/473 [==============================] - ETA: 0s - loss: 0.8577 - accuracy: 0.6384
Epoch 11: val_accuracy improved from 0.64014 to 0.64597, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.8577 - accuracy: 0.6384 - val_loss: 0.8408 - val_accuracy: 0.6460 - lr: 0.0010
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 0.8406 - accuracy: 0.6428
Epoch 12: val_accuracy did not improve from 0.64597
473/473 [==============================] - 24s 51ms/step - loss: 0.8406 - accuracy: 0.6428 - val_loss: 0.8567 - val_accuracy: 0.6401 - lr: 0.0010
Epoch 13/100
472/473 [============================>.] - ETA: 0s - loss: 0.8291 - accuracy: 0.6538
Epoch 13: val_accuracy improved from 0.64597 to 0.64678, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.8292 - accuracy: 0.6537 - val_loss: 0.8581 - val_accuracy: 0.6468 - lr: 0.0010
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 0.8032 - accuracy: 0.6660
Epoch 14: val_accuracy improved from 0.64678 to 0.65984, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.8032 - accuracy: 0.6660 - val_loss: 0.8278 - val_accuracy: 0.6598 - lr: 0.0010
Epoch 15/100
472/473 [============================>.] - ETA: 0s - loss: 0.8048 - accuracy: 0.6711
Epoch 15: val_accuracy did not improve from 0.65984
473/473 [==============================] - 24s 51ms/step - loss: 0.8044 - accuracy: 0.6711 - val_loss: 0.8348 - val_accuracy: 0.6488 - lr: 0.0010
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 0.7877 - accuracy: 0.6728
Epoch 16: val_accuracy improved from 0.65984 to 0.66245, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 50ms/step - loss: 0.7877 - accuracy: 0.6728 - val_loss: 0.8197 - val_accuracy: 0.6624 - lr: 0.0010
Epoch 17/100
473/473 [==============================] - ETA: 0s - loss: 0.7711 - accuracy: 0.6781
Epoch 17: val_accuracy improved from 0.66245 to 0.66848, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 50ms/step - loss: 0.7711 - accuracy: 0.6781 - val_loss: 0.8042 - val_accuracy: 0.6685 - lr: 0.0010
Epoch 18/100
472/473 [============================>.] - ETA: 0s - loss: 0.7557 - accuracy: 0.6853
Epoch 18: val_accuracy improved from 0.66848 to 0.67410, saving model to ./model_1_rgb.h5
473/473 [==============================] - 25s 52ms/step - loss: 0.7553 - accuracy: 0.6855 - val_loss: 0.8002 - val_accuracy: 0.6741 - lr: 0.0010
Epoch 19/100
473/473 [==============================] - ETA: 0s - loss: 0.7483 - accuracy: 0.6945
Epoch 19: val_accuracy did not improve from 0.67410
473/473 [==============================] - 24s 51ms/step - loss: 0.7483 - accuracy: 0.6945 - val_loss: 0.8005 - val_accuracy: 0.6707 - lr: 0.0010
Epoch 20/100
472/473 [============================>.] - ETA: 0s - loss: 0.7383 - accuracy: 0.6952
Epoch 20: val_accuracy did not improve from 0.67410
473/473 [==============================] - 24s 51ms/step - loss: 0.7396 - accuracy: 0.6950 - val_loss: 0.7997 - val_accuracy: 0.6711 - lr: 0.0010
Epoch 21/100
472/473 [============================>.] - ETA: 0s - loss: 0.7222 - accuracy: 0.7033
Epoch 21: val_accuracy did not improve from 0.67410
473/473 [==============================] - 24s 51ms/step - loss: 0.7223 - accuracy: 0.7031 - val_loss: 0.8052 - val_accuracy: 0.6711 - lr: 0.0010
Epoch 22/100
472/473 [============================>.] - ETA: 0s - loss: 0.7082 - accuracy: 0.7097
Epoch 22: val_accuracy improved from 0.67410 to 0.67732, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.7083 - accuracy: 0.7096 - val_loss: 0.8101 - val_accuracy: 0.6773 - lr: 0.0010
Epoch 23/100
472/473 [============================>.] - ETA: 0s - loss: 0.6996 - accuracy: 0.7115
Epoch 23: val_accuracy did not improve from 0.67732
473/473 [==============================] - 25s 52ms/step - loss: 0.6999 - accuracy: 0.7113 - val_loss: 0.7905 - val_accuracy: 0.6765 - lr: 0.0010
Epoch 24/100
473/473 [==============================] - ETA: 0s - loss: 0.6804 - accuracy: 0.7218
Epoch 24: val_accuracy improved from 0.67732 to 0.68234, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 51ms/step - loss: 0.6804 - accuracy: 0.7218 - val_loss: 0.7789 - val_accuracy: 0.6823 - lr: 0.0010
Epoch 25/100
472/473 [============================>.] - ETA: 0s - loss: 0.6712 - accuracy: 0.7257
Epoch 25: val_accuracy did not improve from 0.68234
473/473 [==============================] - 24s 51ms/step - loss: 0.6713 - accuracy: 0.7257 - val_loss: 0.8476 - val_accuracy: 0.6673 - lr: 0.0010
Epoch 26/100
473/473 [==============================] - ETA: 0s - loss: 0.6590 - accuracy: 0.7343
Epoch 26: val_accuracy did not improve from 0.68234
473/473 [==============================] - 24s 51ms/step - loss: 0.6590 - accuracy: 0.7343 - val_loss: 0.8357 - val_accuracy: 0.6669 - lr: 0.0010
Epoch 27/100
473/473 [==============================] - ETA: 0s - loss: 0.6475 - accuracy: 0.7387
Epoch 27: val_accuracy did not improve from 0.68234

Epoch 27: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 24s 51ms/step - loss: 0.6475 - accuracy: 0.7387 - val_loss: 0.8530 - val_accuracy: 0.6602 - lr: 0.0010
Epoch 28/100
473/473 [==============================] - ETA: 0s - loss: 0.5849 - accuracy: 0.7625
Epoch 28: val_accuracy improved from 0.68234 to 0.68495, saving model to ./model_1_rgb.h5
473/473 [==============================] - 24s 52ms/step - loss: 0.5849 - accuracy: 0.7625 - val_loss: 0.7955 - val_accuracy: 0.6850 - lr: 2.0000e-04
Epoch 29/100
472/473 [============================>.] - ETA: 0s - loss: 0.5771 - accuracy: 0.7661
Epoch 29: val_accuracy did not improve from 0.68495
Restoring model weights from the end of the best epoch: 24.
473/473 [==============================] - 24s 50ms/step - loss: 0.5769 - accuracy: 0.7661 - val_loss: 0.8159 - val_accuracy: 0.6833 - lr: 2.0000e-04
Epoch 29: early stopping
In [36]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_1_rgb.history['accuracy'])
plt.plot(history_1_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 1 (RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [37]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_1_rgb.history['loss'])
plt.plot(history_1_rgb.history['val_loss'])
plt.title('Loss - Model 1 (RGB)')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the Model on the Test Set¶

In [38]:
# Evaluating the model's performance on the test set
accuracy = model_1_rgb.evaluate(test_set_rgb)
4/4 [==============================] - 0s 29ms/step - loss: 0.8046 - accuracy: 0.6328

Observations and Insights:
As constructed, our baseline RGB model also performs decently. After 24 epochs (best epoch), training accuracy stands at 0.72 and validation accuracy is 0.68. Training accuracy and loss continue to improve, while validation accuracy and loss begin to level off before early-stopping ends the training process. Accuracy on the test set is 0.63.

Our baseline grayscale and RGB models perform similarly across all metrics. Overall, both models underfit the data for 10-15 epochs, likely due to the addition of Dropout layers in the model architecture, after which the models begin to overfit the data, performing similarly. Perhaps a slight edge to the grayscale model for performing better on the test set with a smaller number of trainable parameters, making it computationally less expensive when scaled.

Training Validation Test
Grayscale Accuracy 0.72 0.68 0.65
RGB Accuracy 0.72 0.68 0.63


In [39]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['accuracy'])
plt.plot(history_1_grayscale.history['val_accuracy'])
plt.plot(history_1_rgb.history['accuracy'])
plt.plot(history_1_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 1 (Grayscale & RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training Accuracy (Grayscale)', 'Validation Accuracy (Grayscale)', 
            'Training Accuracy (RGB)', 'Validation Accuracy (RGB)'], loc='lower right')
plt.show()

Model 2.1: 2nd Generation (Grayscale)¶

Note:
We will now build a slightly deeper model to see if we can improve performance. Similar to our baseline models, we will train this model with color_modes of grayscale and RGB so we can compare performance.

The architecture of our second model is comprised of 4 convolutional blocks with relu activation, BatchNormalization, a LeakyReLu layer, and MaxPooling, followed by a dense layer with 512 neurons, another dense layer with 256 neurons, and finally a softmax classifier. The grayscale model has a total of 455,780 parameters.

In [40]:
# Creating a Sequential model
model_2_grayscale = Sequential()
 
# Convolutional Block #1
model_2_grayscale.add(Conv2D(256, (2, 2), input_shape = (48, 48, 1), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))

# Convolutional Block #2
model_2_grayscale.add(Conv2D(128, (2, 2), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))

# Convolutional Block #3
model_2_grayscale.add(Conv2D(64, (2, 2), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))

# Convolutional Block #4
model_2_grayscale.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_2_grayscale.add(BatchNormalization())
model_2_grayscale.add(LeakyReLU(alpha = 0.1))
model_2_grayscale.add(MaxPooling2D(2, 2))

# Flatten layer
model_2_grayscale.add(Flatten())

# Dense layers
model_2_grayscale.add(Dense(512, activation = 'relu'))
model_2_grayscale.add(Dense(256, activation = 'relu'))

# Classifier
model_2_grayscale.add(Dense(4, activation = 'softmax'))

model_2_grayscale.summary()
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_6 (Conv2D)           (None, 48, 48, 256)       1280      
                                                                 
 batch_normalization (BatchN  (None, 48, 48, 256)      1024      
 ormalization)                                                   
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 48, 48, 256)       0         
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 24, 24, 256)      0         
 2D)                                                             
                                                                 
 conv2d_7 (Conv2D)           (None, 24, 24, 128)       131200    
                                                                 
 batch_normalization_1 (Batc  (None, 24, 24, 128)      512       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 24, 24, 128)       0         
                                                                 
 max_pooling2d_7 (MaxPooling  (None, 12, 12, 128)      0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 12, 12, 64)        32832     
                                                                 
 batch_normalization_2 (Batc  (None, 12, 12, 64)       256       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_2 (LeakyReLU)   (None, 12, 12, 64)        0         
                                                                 
 max_pooling2d_8 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                             
                                                                 
 conv2d_9 (Conv2D)           (None, 6, 6, 32)          8224      
                                                                 
 batch_normalization_3 (Batc  (None, 6, 6, 32)         128       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_3 (LeakyReLU)   (None, 6, 6, 32)          0         
                                                                 
 max_pooling2d_9 (MaxPooling  (None, 3, 3, 32)         0         
 2D)                                                             
                                                                 
 flatten_2 (Flatten)         (None, 288)               0         
                                                                 
 dense_4 (Dense)             (None, 512)               147968    
                                                                 
 dense_5 (Dense)             (None, 256)               131328    
                                                                 
 dense_6 (Dense)             (None, 4)                 1028      
                                                                 
=================================================================
Total params: 455,780
Trainable params: 454,820
Non-trainable params: 960
_________________________________________________________________

Compiling and Training the Model¶

In [41]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_2_grayscale.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 5,
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [42]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_2_grayscale.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [43]:
# Fitting model with epochs set to 100
history_2_grayscale = model_2_grayscale.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
473/473 [==============================] - ETA: 0s - loss: 1.2684 - accuracy: 0.4053
Epoch 1: val_accuracy improved from -inf to 0.42134, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 43s 89ms/step - loss: 1.2684 - accuracy: 0.4053 - val_loss: 1.2783 - val_accuracy: 0.4213 - lr: 0.0010
Epoch 2/100
473/473 [==============================] - ETA: 0s - loss: 1.0451 - accuracy: 0.5404
Epoch 2: val_accuracy improved from 0.42134 to 0.58368, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 36s 75ms/step - loss: 1.0451 - accuracy: 0.5404 - val_loss: 0.9890 - val_accuracy: 0.5837 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 0.9524 - accuracy: 0.5900
Epoch 3: val_accuracy did not improve from 0.58368
473/473 [==============================] - 37s 79ms/step - loss: 0.9524 - accuracy: 0.5900 - val_loss: 0.9803 - val_accuracy: 0.5694 - lr: 0.0010
Epoch 4/100
473/473 [==============================] - ETA: 0s - loss: 0.8941 - accuracy: 0.6146
Epoch 4: val_accuracy improved from 0.58368 to 0.61061, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 44s 92ms/step - loss: 0.8941 - accuracy: 0.6146 - val_loss: 0.9132 - val_accuracy: 0.6106 - lr: 0.0010
Epoch 5/100
473/473 [==============================] - ETA: 0s - loss: 0.8503 - accuracy: 0.6401
Epoch 5: val_accuracy did not improve from 0.61061
473/473 [==============================] - 37s 79ms/step - loss: 0.8503 - accuracy: 0.6401 - val_loss: 0.9677 - val_accuracy: 0.5929 - lr: 0.0010
Epoch 6/100
473/473 [==============================] - ETA: 0s - loss: 0.8184 - accuracy: 0.6495
Epoch 6: val_accuracy improved from 0.61061 to 0.65863, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 36s 76ms/step - loss: 0.8184 - accuracy: 0.6495 - val_loss: 0.8306 - val_accuracy: 0.6586 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 0.7853 - accuracy: 0.6723
Epoch 7: val_accuracy did not improve from 0.65863
473/473 [==============================] - 36s 76ms/step - loss: 0.7853 - accuracy: 0.6723 - val_loss: 0.8979 - val_accuracy: 0.6225 - lr: 0.0010
Epoch 8/100
473/473 [==============================] - ETA: 0s - loss: 0.7632 - accuracy: 0.6789
Epoch 8: val_accuracy did not improve from 0.65863
473/473 [==============================] - 36s 77ms/step - loss: 0.7632 - accuracy: 0.6789 - val_loss: 0.9091 - val_accuracy: 0.6205 - lr: 0.0010
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 0.7443 - accuracy: 0.6879
Epoch 9: val_accuracy did not improve from 0.65863
473/473 [==============================] - 42s 90ms/step - loss: 0.7443 - accuracy: 0.6879 - val_loss: 0.8299 - val_accuracy: 0.6540 - lr: 0.0010
Epoch 10/100
473/473 [==============================] - ETA: 0s - loss: 0.7266 - accuracy: 0.6940
Epoch 10: val_accuracy improved from 0.65863 to 0.67229, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 41s 86ms/step - loss: 0.7266 - accuracy: 0.6940 - val_loss: 0.7992 - val_accuracy: 0.6723 - lr: 0.0010
Epoch 11/100
473/473 [==============================] - ETA: 0s - loss: 0.7089 - accuracy: 0.7046
Epoch 11: val_accuracy did not improve from 0.67229
473/473 [==============================] - 38s 79ms/step - loss: 0.7089 - accuracy: 0.7046 - val_loss: 0.8848 - val_accuracy: 0.6418 - lr: 0.0010
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 0.6910 - accuracy: 0.7096
Epoch 12: val_accuracy improved from 0.67229 to 0.68194, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 37s 78ms/step - loss: 0.6910 - accuracy: 0.7096 - val_loss: 0.7942 - val_accuracy: 0.6819 - lr: 0.0010
Epoch 13/100
473/473 [==============================] - ETA: 0s - loss: 0.6705 - accuracy: 0.7235
Epoch 13: val_accuracy did not improve from 0.68194
473/473 [==============================] - 36s 77ms/step - loss: 0.6705 - accuracy: 0.7235 - val_loss: 0.8160 - val_accuracy: 0.6815 - lr: 0.0010
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 0.6581 - accuracy: 0.7286
Epoch 14: val_accuracy improved from 0.68194 to 0.68857, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 38s 81ms/step - loss: 0.6581 - accuracy: 0.7286 - val_loss: 0.7747 - val_accuracy: 0.6886 - lr: 0.0010
Epoch 15/100
473/473 [==============================] - ETA: 0s - loss: 0.6475 - accuracy: 0.7255
Epoch 15: val_accuracy did not improve from 0.68857
473/473 [==============================] - 36s 77ms/step - loss: 0.6475 - accuracy: 0.7255 - val_loss: 0.7918 - val_accuracy: 0.6769 - lr: 0.0010
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 0.6280 - accuracy: 0.7408
Epoch 16: val_accuracy did not improve from 0.68857
473/473 [==============================] - 37s 78ms/step - loss: 0.6280 - accuracy: 0.7408 - val_loss: 0.7942 - val_accuracy: 0.6701 - lr: 0.0010
Epoch 17/100
473/473 [==============================] - ETA: 0s - loss: 0.6140 - accuracy: 0.7495
Epoch 17: val_accuracy did not improve from 0.68857

Epoch 17: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 38s 80ms/step - loss: 0.6140 - accuracy: 0.7495 - val_loss: 0.7844 - val_accuracy: 0.6874 - lr: 0.0010
Epoch 18/100
473/473 [==============================] - ETA: 0s - loss: 0.5395 - accuracy: 0.7787
Epoch 18: val_accuracy improved from 0.68857 to 0.71469, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 39s 82ms/step - loss: 0.5395 - accuracy: 0.7787 - val_loss: 0.7416 - val_accuracy: 0.7147 - lr: 2.0000e-04
Epoch 19/100
473/473 [==============================] - ETA: 0s - loss: 0.5027 - accuracy: 0.7936
Epoch 19: val_accuracy improved from 0.71469 to 0.72011, saving model to ./model_2_grayscale.h5
473/473 [==============================] - 37s 79ms/step - loss: 0.5027 - accuracy: 0.7936 - val_loss: 0.7523 - val_accuracy: 0.7201 - lr: 2.0000e-04
Epoch 20/100
473/473 [==============================] - ETA: 0s - loss: 0.4969 - accuracy: 0.7956
Epoch 20: val_accuracy did not improve from 0.72011
473/473 [==============================] - 36s 77ms/step - loss: 0.4969 - accuracy: 0.7956 - val_loss: 0.7552 - val_accuracy: 0.7141 - lr: 2.0000e-04
Epoch 21/100
473/473 [==============================] - ETA: 0s - loss: 0.4704 - accuracy: 0.8069
Epoch 21: val_accuracy did not improve from 0.72011

Epoch 21: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 36s 77ms/step - loss: 0.4704 - accuracy: 0.8069 - val_loss: 0.7567 - val_accuracy: 0.7195 - lr: 2.0000e-04
Epoch 22/100
473/473 [==============================] - ETA: 0s - loss: 0.4488 - accuracy: 0.8198
Epoch 22: val_accuracy did not improve from 0.72011
473/473 [==============================] - 37s 77ms/step - loss: 0.4488 - accuracy: 0.8198 - val_loss: 0.7835 - val_accuracy: 0.7125 - lr: 4.0000e-05
Epoch 23/100
473/473 [==============================] - ETA: 0s - loss: 0.4423 - accuracy: 0.8201
Epoch 23: val_accuracy did not improve from 0.72011
Restoring model weights from the end of the best epoch: 18.
473/473 [==============================] - 38s 80ms/step - loss: 0.4423 - accuracy: 0.8201 - val_loss: 0.7753 - val_accuracy: 0.7195 - lr: 4.0000e-05
Epoch 23: early stopping
In [44]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_2_grayscale.history['accuracy'])
plt.plot(history_2_grayscale.history['val_accuracy'])
plt.title('Accuracy - Model 2 (Grayscale)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [45]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_2_grayscale.history['loss'])
plt.plot(history_2_grayscale.history['val_loss'])
plt.title('Loss - Model 2 (Grayscale)')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the Model on the Test Set¶

In [46]:
accuracy = model_2_grayscale.evaluate(test_set_grayscale)
4/4 [==============================] - 0s 29ms/step - loss: 0.8122 - accuracy: 0.6875

Observations and Insights:
As constructed, our second, deeper grayscale model performs somewhat differently than its predecessor. After 18 epochs (best epoch), training accuracy stands at 0.78 and validation accuracy is 0.71, which are both higher than Model 1, but Model 2 begins to overfit almost immediately, and the gaps between training and accuracy scores only grow from there. Training accuracy and loss continue to improve, while validation accuracy and loss begin to level off before early-stopping ends the training process. Accuracy on the test set is 0.69. Our model is not generalizing well, though with better accuracy scores compared to Model 1, there is an opportunity (if overfitting can be reduced) to become the better grayscale model.

Training Validation Test
Grayscale Accuracy 0.78 0.71 0.69


Model 2.2: 2nd Generation (RGB)¶

Note:
This model will contain the same architecture as the above grayscale model. Due to the input shape changing from 48,48,1 (grayscale) to 48,48,3 (rgb), the total parameters have increased to 457,828.

In [47]:
# Creating a Sequential model
model_2_rgb = Sequential()
 
# Convolutional Block #1
model_2_rgb.add(Conv2D(256, (2, 2), input_shape = (48, 48, 3), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))

# Convolutional Block #2
model_2_rgb.add(Conv2D(128, (2, 2), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))

# Convolutional Block #3
model_2_rgb.add(Conv2D(64, (2, 2), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))

# Convolutional Block #4
model_2_rgb.add(Conv2D(32, (2, 2), activation='relu', padding = 'same'))
model_2_rgb.add(BatchNormalization())
model_2_rgb.add(LeakyReLU(alpha = 0.1))
model_2_rgb.add(MaxPooling2D(2, 2))

# Flatten layer
model_2_rgb.add(Flatten())

# Dense layers
model_2_rgb.add(Dense(512, activation = 'relu'))
model_2_rgb.add(Dense(256, activation = 'relu'))

# Classifier
model_2_rgb.add(Dense(4, activation = 'softmax'))

model_2_rgb.summary()
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_10 (Conv2D)          (None, 48, 48, 256)       3328      
                                                                 
 batch_normalization_4 (Batc  (None, 48, 48, 256)      1024      
 hNormalization)                                                 
                                                                 
 leaky_re_lu_4 (LeakyReLU)   (None, 48, 48, 256)       0         
                                                                 
 max_pooling2d_10 (MaxPoolin  (None, 24, 24, 256)      0         
 g2D)                                                            
                                                                 
 conv2d_11 (Conv2D)          (None, 24, 24, 128)       131200    
                                                                 
 batch_normalization_5 (Batc  (None, 24, 24, 128)      512       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_5 (LeakyReLU)   (None, 24, 24, 128)       0         
                                                                 
 max_pooling2d_11 (MaxPoolin  (None, 12, 12, 128)      0         
 g2D)                                                            
                                                                 
 conv2d_12 (Conv2D)          (None, 12, 12, 64)        32832     
                                                                 
 batch_normalization_6 (Batc  (None, 12, 12, 64)       256       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_6 (LeakyReLU)   (None, 12, 12, 64)        0         
                                                                 
 max_pooling2d_12 (MaxPoolin  (None, 6, 6, 64)         0         
 g2D)                                                            
                                                                 
 conv2d_13 (Conv2D)          (None, 6, 6, 32)          8224      
                                                                 
 batch_normalization_7 (Batc  (None, 6, 6, 32)         128       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_7 (LeakyReLU)   (None, 6, 6, 32)          0         
                                                                 
 max_pooling2d_13 (MaxPoolin  (None, 3, 3, 32)         0         
 g2D)                                                            
                                                                 
 flatten_3 (Flatten)         (None, 288)               0         
                                                                 
 dense_7 (Dense)             (None, 512)               147968    
                                                                 
 dense_8 (Dense)             (None, 256)               131328    
                                                                 
 dense_9 (Dense)             (None, 4)                 1028      
                                                                 
=================================================================
Total params: 457,828
Trainable params: 456,868
Non-trainable params: 960
_________________________________________________________________

Compiling and Training the Model¶

In [48]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint("./model_2_rgb.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 5,
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [49]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_2_rgb.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [50]:
# Fitting model with epochs set to 100
history_2_rgb = model_2_rgb.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
473/473 [==============================] - ETA: 0s - loss: 1.2630 - accuracy: 0.4033
Epoch 1: val_accuracy improved from -inf to 0.39140, saving model to ./model_2_rgb.h5
473/473 [==============================] - 48s 101ms/step - loss: 1.2630 - accuracy: 0.4033 - val_loss: 1.2748 - val_accuracy: 0.3914 - lr: 0.0010
Epoch 2/100
473/473 [==============================] - ETA: 0s - loss: 1.0839 - accuracy: 0.5184
Epoch 2: val_accuracy improved from 0.39140 to 0.48021, saving model to ./model_2_rgb.h5
473/473 [==============================] - 41s 87ms/step - loss: 1.0839 - accuracy: 0.5184 - val_loss: 1.1498 - val_accuracy: 0.4802 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 0.9816 - accuracy: 0.5715
Epoch 3: val_accuracy improved from 0.48021 to 0.59936, saving model to ./model_2_rgb.h5
473/473 [==============================] - 41s 86ms/step - loss: 0.9816 - accuracy: 0.5715 - val_loss: 0.9470 - val_accuracy: 0.5994 - lr: 0.0010
Epoch 4/100
473/473 [==============================] - ETA: 0s - loss: 0.9106 - accuracy: 0.6063
Epoch 4: val_accuracy improved from 0.59936 to 0.61001, saving model to ./model_2_rgb.h5
473/473 [==============================] - 41s 86ms/step - loss: 0.9106 - accuracy: 0.6063 - val_loss: 0.9275 - val_accuracy: 0.6100 - lr: 0.0010
Epoch 5/100
473/473 [==============================] - ETA: 0s - loss: 0.8590 - accuracy: 0.6297
Epoch 5: val_accuracy did not improve from 0.61001
473/473 [==============================] - 41s 86ms/step - loss: 0.8590 - accuracy: 0.6297 - val_loss: 0.9990 - val_accuracy: 0.5604 - lr: 0.0010
Epoch 6/100
473/473 [==============================] - ETA: 0s - loss: 0.8193 - accuracy: 0.6492
Epoch 6: val_accuracy did not improve from 0.61001
473/473 [==============================] - 42s 89ms/step - loss: 0.8193 - accuracy: 0.6492 - val_loss: 0.9861 - val_accuracy: 0.5895 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 0.7907 - accuracy: 0.6661
Epoch 7: val_accuracy improved from 0.61001 to 0.67129, saving model to ./model_2_rgb.h5
473/473 [==============================] - 42s 90ms/step - loss: 0.7907 - accuracy: 0.6661 - val_loss: 0.7824 - val_accuracy: 0.6713 - lr: 0.0010
Epoch 8/100
473/473 [==============================] - ETA: 0s - loss: 0.7698 - accuracy: 0.6754
Epoch 8: val_accuracy did not improve from 0.67129
473/473 [==============================] - 47s 99ms/step - loss: 0.7698 - accuracy: 0.6754 - val_loss: 0.8390 - val_accuracy: 0.6562 - lr: 0.0010
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 0.7466 - accuracy: 0.6871
Epoch 9: val_accuracy did not improve from 0.67129
473/473 [==============================] - 41s 86ms/step - loss: 0.7466 - accuracy: 0.6871 - val_loss: 0.8137 - val_accuracy: 0.6624 - lr: 0.0010
Epoch 10/100
473/473 [==============================] - ETA: 0s - loss: 0.7287 - accuracy: 0.6936
Epoch 10: val_accuracy improved from 0.67129 to 0.67932, saving model to ./model_2_rgb.h5

Epoch 10: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 41s 87ms/step - loss: 0.7287 - accuracy: 0.6936 - val_loss: 0.7865 - val_accuracy: 0.6793 - lr: 0.0010
Epoch 11/100
473/473 [==============================] - ETA: 0s - loss: 0.6521 - accuracy: 0.7298
Epoch 11: val_accuracy improved from 0.67932 to 0.70384, saving model to ./model_2_rgb.h5
473/473 [==============================] - 42s 88ms/step - loss: 0.6521 - accuracy: 0.7298 - val_loss: 0.7338 - val_accuracy: 0.7038 - lr: 2.0000e-04
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 0.6267 - accuracy: 0.7419
Epoch 12: val_accuracy improved from 0.70384 to 0.71187, saving model to ./model_2_rgb.h5
473/473 [==============================] - 41s 87ms/step - loss: 0.6267 - accuracy: 0.7419 - val_loss: 0.7317 - val_accuracy: 0.7119 - lr: 2.0000e-04
Epoch 13/100
473/473 [==============================] - ETA: 0s - loss: 0.6081 - accuracy: 0.7499
Epoch 13: val_accuracy did not improve from 0.71187
473/473 [==============================] - 41s 88ms/step - loss: 0.6081 - accuracy: 0.7499 - val_loss: 0.7514 - val_accuracy: 0.7042 - lr: 2.0000e-04
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 0.6010 - accuracy: 0.7519
Epoch 14: val_accuracy did not improve from 0.71187
473/473 [==============================] - 42s 88ms/step - loss: 0.6010 - accuracy: 0.7519 - val_loss: 0.7564 - val_accuracy: 0.7014 - lr: 2.0000e-04
Epoch 15/100
473/473 [==============================] - ETA: 0s - loss: 0.5884 - accuracy: 0.7584
Epoch 15: val_accuracy improved from 0.71187 to 0.71308, saving model to ./model_2_rgb.h5
473/473 [==============================] - 46s 97ms/step - loss: 0.5884 - accuracy: 0.7584 - val_loss: 0.7226 - val_accuracy: 0.7131 - lr: 2.0000e-04
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 0.5740 - accuracy: 0.7657
Epoch 16: val_accuracy did not improve from 0.71308
473/473 [==============================] - 41s 87ms/step - loss: 0.5740 - accuracy: 0.7657 - val_loss: 0.7616 - val_accuracy: 0.6988 - lr: 2.0000e-04
Epoch 17/100
473/473 [==============================] - ETA: 0s - loss: 0.5579 - accuracy: 0.7734
Epoch 17: val_accuracy improved from 0.71308 to 0.71650, saving model to ./model_2_rgb.h5
473/473 [==============================] - 41s 88ms/step - loss: 0.5579 - accuracy: 0.7734 - val_loss: 0.7491 - val_accuracy: 0.7165 - lr: 2.0000e-04
Epoch 18/100
473/473 [==============================] - ETA: 0s - loss: 0.5524 - accuracy: 0.7742
Epoch 18: val_accuracy did not improve from 0.71650

Epoch 18: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 42s 89ms/step - loss: 0.5524 - accuracy: 0.7742 - val_loss: 0.7648 - val_accuracy: 0.7054 - lr: 2.0000e-04
Epoch 19/100
473/473 [==============================] - ETA: 0s - loss: 0.5191 - accuracy: 0.7883
Epoch 19: val_accuracy did not improve from 0.71650
473/473 [==============================] - 41s 86ms/step - loss: 0.5191 - accuracy: 0.7883 - val_loss: 0.7552 - val_accuracy: 0.7115 - lr: 4.0000e-05
Epoch 20/100
473/473 [==============================] - ETA: 0s - loss: 0.5135 - accuracy: 0.7895
Epoch 20: val_accuracy did not improve from 0.71650
Restoring model weights from the end of the best epoch: 15.
473/473 [==============================] - 45s 95ms/step - loss: 0.5135 - accuracy: 0.7895 - val_loss: 0.7533 - val_accuracy: 0.7101 - lr: 4.0000e-05
Epoch 20: early stopping
In [51]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_2_rgb.history['accuracy'])
plt.plot(history_2_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 2 (RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [52]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_2_rgb.history['loss'])
plt.plot(history_2_rgb.history['val_loss'])
plt.title('Loss - Model 2 (RGB)')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the Model on the Test Set¶

In [53]:
# Evaluating the model's performance on the test set
accuracy = model_2_rgb.evaluate(test_set_rgb)
4/4 [==============================] - 0s 41ms/step - loss: 0.6983 - accuracy: 0.6797

Observations and Insights:

As constructed, our second RGB model also performs somewhat differently than its predecessor. After 15 epochs (best epoch), training accuracy stands at 0.76 and validation accuracy is 0.71, which are both higher than Model 1, but Model 2 begins to overfit almost immediately. Training accuracy and loss continue to improve, while validation accuracy and loss level off before early-stopping ends the training process. Accuracy on the test set is 0.68. Once again, our model is not generalizing well, though with better accuracy scores compared to Model 1, there is an opportunity (if overfitting can be reduced) to become the better RGB model.

Our deeper grayscale and RGB models again perform similarly across all metrics, with the grayscale model attaining slightly better accuracies. Again, a slight edge to the grayscale model for performing better on the test set with a smaller number of trainable parameters.

Training Validation Test
Grayscale Accuracy 0.78 0.71 0.69
RGB Accuracy 0.76 0.71 0.68


In [54]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_2_grayscale.history['accuracy'])
plt.plot(history_2_grayscale.history['val_accuracy'])
plt.plot(history_2_rgb.history['accuracy'])
plt.plot(history_2_rgb.history['val_accuracy'])
plt.title('Accuracy - Model 2 (Grayscale & RGB)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training Accuracy (Grayscale)', 'Validation Accuracy (Grayscale)', 
            'Training Accuracy (RGB)', 'Validation Accuracy (RGB)'], loc='lower right')
plt.show()

Overall Observations and Insights on Initial Models:

  • As discussed above, both grayscale models slightly outperformed their RGB counterparts, and did so using less trainable parameters, making them less computationally expensive. Given this performance, we will proceed with grayscale models when doing so is possible.
  • As the datasets for this project are black and white images, it is possible that a grayscale colormode works better than a RGB colormode on what are essentially grayscale images. In this case, adding a second and third channel and increasing the input shape from 48,48,1 to 48,48,3 does not seem to help the modeling, and in fact may be making it overly complex.
  • As evidenced by the graph below, the 4 models thus far have fairly similar accuracy trajectories, though with a fair degree of separation between them. There is obviously room for improvement when it comes to overall accuracy. As early-stopping has prohibited us from seeing whether or not the training accuracy and loss level off before reaching 100%, it is clear that they continue to improve while validation accuracy and loss level off.
  • Some possible ways to decrease overfitting and thereby improve the above models include:
    • Introduce additional forms of data augmentation. While the above models take advantage of horizontal_flip, brightness_range, rescale, and sheer_range, it is possible that introducing additional forms of data augmentation (like width_shift_range, height_shift_range, zoon_range, rotation_range, etc. as discussed above) could help improve model performance.
    • Additional use BatchNormalization could also improve performance by offering some degree of regularization.
    • Additional use of DropOut and SpatialDropout could also help improve performance by assisting in regularization.
    • Introducting GaussianNoise could also assist in regularization, adding a form of noise to the data.
In [55]:
# Plotting the accuracies
plt.figure(figsize = (10, 5))
plt.plot(history_1_grayscale.history['accuracy'])
plt.plot(history_1_grayscale.history['val_accuracy'])
plt.plot(history_1_rgb.history['accuracy'])
plt.plot(history_1_rgb.history['val_accuracy'])
plt.plot(history_2_grayscale.history['accuracy'])
plt.plot(history_2_grayscale.history['val_accuracy'])
plt.plot(history_2_rgb.history['accuracy'])
plt.plot(history_2_rgb.history['val_accuracy'])
plt.title('Accuracy - Models 1 & 2 (Grayscale & RGB)' )
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training Accuracy - Model 1 (Grayscale)', 
            'Validation Accuracy - Model 1 (Grayscale)',
            'Training Accuracy  - Model 1 (RGB)', 
            'Validation Accuracy  - Model 1 (RGB)',
            'Training Accuracy - Model 2 (Grayscale)', 
            'Validation Accuracy - Model 2 (Grayscale)',
            'Training Accuracy - Model 2 (RGB)', 
            'Validation Accuracy - Model 2 (RGB)'], loc='lower right')
plt.show()

Transfer Learning Architectures¶

In this section, we will create several Transfer Learning architectures. For the pre-trained models, we will select three popular architectures, namely: VGG16, ResNet v2, and Efficient Net. The difference between these architectures and the previous architectures is that these will require 3 input channels (RGB) while the earlier models also worked on grayscale images.

Creating our Data Loaders for Transfer Learning Architectures¶

We will create new data loaders for the transfer learning architectures used below. As required by the architectures we will be piggybacking, color_mode will be set to RGB.

Additionally, we will be using the same data augmentation methods used on our previous models in order to better compare performance against our baseline models. These methods include horizontal_flip, brightness_range, rescale, and shear_range.

In [56]:
batch_size  = 32

# Creating ImageDataGenerator objects for RGB colormode
datagen_train_rgb = ImageDataGenerator(horizontal_flip = True, 
                                       brightness_range = (0.,2.),
                                       rescale = 1./255, 
                                       shear_range = 0.3)

datagen_validation_rgb = ImageDataGenerator(horizontal_flip = True, 
                                            brightness_range = (0.,2.),
                                            rescale = 1./255, 
                                            shear_range = 0.3)

datagen_test_rgb = ImageDataGenerator(horizontal_flip = True, 
                                      brightness_range = (0.,2.),
                                      rescale = 1./255, 
                                      shear_range = 0.3)


# Creating train, validation, and test sets for RGB colormode

print("\nColor Images")

train_set_rgb = datagen_train_rgb.flow_from_directory(dir_train,
                        target_size = (img_size, img_size),
                        color_mode = "rgb",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'], 
                        seed = 42,  
                        shuffle = True)

val_set_rgb = datagen_validation_rgb.flow_from_directory(dir_validation,
                        target_size = (img_size, img_size),
                        color_mode = "rgb",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)

test_set_rgb = datagen_test_rgb.flow_from_directory(dir_test,
                        target_size = (img_size, img_size),
                        color_mode = "rgb",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)
Color Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.

Model 3: VGG16¶

First up is the VGG16 model, which is a CNN consisting of 13 convolutional layers, 5 MaxPooling layers, and 3 dense layers. The VGG16 model achieves nearly 93% accuracy on the ImageNet dataset containing 14 million images across 1,000 classes. Clearly, this is much more substantial than our models above.

Importing the VGG16 Architecture¶

In [57]:
vgg = VGG16(include_top = False, weights = 'imagenet', input_shape = (48, 48, 3))
vgg.summary()
Metal device set to: Apple M1 Pro
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 48, 48, 3)]       0         
                                                                 
 block1_conv1 (Conv2D)       (None, 48, 48, 64)        1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 48, 48, 64)        36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 24, 24, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 24, 24, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 24, 24, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 12, 12, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 12, 12, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 12, 12, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 12, 12, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 6, 6, 256)         0         
                                                                 
 block4_conv1 (Conv2D)       (None, 6, 6, 512)         1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 6, 6, 512)         2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 6, 6, 512)         2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 3, 3, 512)         0         
                                                                 
 block5_conv1 (Conv2D)       (None, 3, 3, 512)         2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 3, 3, 512)         2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 3, 3, 512)         2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 1, 1, 512)         0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

Model Building¶

We have imported the VGG16 model up to layer 'block4_pool', as this has shown the best performance compared to other layers (discussed below). The VGG16 layers will be frozen, so the only trainable layers will be those we add ourselves. After flattening the input from 'block4_pool', 2 dense layers will be added, followed by a Dropout layer, another dense layer, and BatchNormalization. We will end with a softmax classifier.

In [58]:
transfer_layer = vgg.get_layer('block4_pool')
vgg.trainable = False

# Flatten the input
x = Flatten()(transfer_layer.output)

# Dense layers
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(64, activation='relu')(x)
x = BatchNormalization()(x)

# Classifier
pred = Dense(4, activation='softmax')(x)

# Initialize the model
model_3 = Model(vgg.input, pred)

Compiling and Training the VGG16 Model¶

In [59]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_3.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 15,     # This is increased compared to initial models, otherwise training is cut too quickly
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [60]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_3.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [61]:
# Fitting model with epochs set to 100
history_3 = model_3.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
472/473 [============================>.] - ETA: 0s - loss: 1.1567 - accuracy: 0.4857
Epoch 1: val_accuracy improved from -inf to 0.58368, saving model to ./model_3.h5
473/473 [==============================] - 27s 56ms/step - loss: 1.1565 - accuracy: 0.4859 - val_loss: 0.9624 - val_accuracy: 0.5837 - lr: 0.0010
Epoch 2/100
472/473 [============================>.] - ETA: 0s - loss: 0.9959 - accuracy: 0.5696
Epoch 2: val_accuracy improved from 0.58368 to 0.60478, saving model to ./model_3.h5
473/473 [==============================] - 26s 55ms/step - loss: 0.9956 - accuracy: 0.5695 - val_loss: 0.9489 - val_accuracy: 0.6048 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 0.9465 - accuracy: 0.5975
Epoch 3: val_accuracy did not improve from 0.60478
473/473 [==============================] - 25s 52ms/step - loss: 0.9465 - accuracy: 0.5975 - val_loss: 0.9521 - val_accuracy: 0.5823 - lr: 0.0010
Epoch 4/100
472/473 [============================>.] - ETA: 0s - loss: 0.9283 - accuracy: 0.6028
Epoch 4: val_accuracy improved from 0.60478 to 0.61583, saving model to ./model_3.h5
473/473 [==============================] - 24s 52ms/step - loss: 0.9283 - accuracy: 0.6029 - val_loss: 0.8993 - val_accuracy: 0.6158 - lr: 0.0010
Epoch 5/100
472/473 [============================>.] - ETA: 0s - loss: 0.8961 - accuracy: 0.6238
Epoch 5: val_accuracy did not improve from 0.61583
473/473 [==============================] - 25s 53ms/step - loss: 0.8965 - accuracy: 0.6237 - val_loss: 1.0021 - val_accuracy: 0.5773 - lr: 0.0010
Epoch 6/100
473/473 [==============================] - ETA: 0s - loss: 0.8771 - accuracy: 0.6313
Epoch 6: val_accuracy did not improve from 0.61583
473/473 [==============================] - 25s 53ms/step - loss: 0.8771 - accuracy: 0.6313 - val_loss: 0.9191 - val_accuracy: 0.6106 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 0.8616 - accuracy: 0.6366
Epoch 7: val_accuracy did not improve from 0.61583

Epoch 7: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 25s 54ms/step - loss: 0.8616 - accuracy: 0.6366 - val_loss: 0.9665 - val_accuracy: 0.5881 - lr: 0.0010
Epoch 8/100
473/473 [==============================] - ETA: 0s - loss: 0.8029 - accuracy: 0.6680
Epoch 8: val_accuracy improved from 0.61583 to 0.64296, saving model to ./model_3.h5
473/473 [==============================] - 25s 52ms/step - loss: 0.8029 - accuracy: 0.6680 - val_loss: 0.8538 - val_accuracy: 0.6430 - lr: 2.0000e-04
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 0.7846 - accuracy: 0.6760
Epoch 9: val_accuracy improved from 0.64296 to 0.64678, saving model to ./model_3.h5
473/473 [==============================] - 25s 53ms/step - loss: 0.7846 - accuracy: 0.6760 - val_loss: 0.8257 - val_accuracy: 0.6468 - lr: 2.0000e-04
Epoch 10/100
473/473 [==============================] - ETA: 0s - loss: 0.7748 - accuracy: 0.6801
Epoch 10: val_accuracy improved from 0.64678 to 0.65963, saving model to ./model_3.h5
473/473 [==============================] - 25s 53ms/step - loss: 0.7748 - accuracy: 0.6801 - val_loss: 0.8253 - val_accuracy: 0.6596 - lr: 2.0000e-04
Epoch 11/100
472/473 [============================>.] - ETA: 0s - loss: 0.7619 - accuracy: 0.6875
Epoch 11: val_accuracy did not improve from 0.65963
473/473 [==============================] - 25s 53ms/step - loss: 0.7625 - accuracy: 0.6875 - val_loss: 0.8434 - val_accuracy: 0.6482 - lr: 2.0000e-04
Epoch 12/100
472/473 [============================>.] - ETA: 0s - loss: 0.7511 - accuracy: 0.6946
Epoch 12: val_accuracy did not improve from 0.65963
473/473 [==============================] - 26s 56ms/step - loss: 0.7517 - accuracy: 0.6942 - val_loss: 0.8930 - val_accuracy: 0.6279 - lr: 2.0000e-04
Epoch 13/100
472/473 [============================>.] - ETA: 0s - loss: 0.7447 - accuracy: 0.6993
Epoch 13: val_accuracy did not improve from 0.65963

Epoch 13: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 25s 53ms/step - loss: 0.7449 - accuracy: 0.6992 - val_loss: 0.8513 - val_accuracy: 0.6446 - lr: 2.0000e-04
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 0.7301 - accuracy: 0.7020
Epoch 14: val_accuracy improved from 0.65963 to 0.66325, saving model to ./model_3.h5
473/473 [==============================] - 25s 53ms/step - loss: 0.7301 - accuracy: 0.7020 - val_loss: 0.8147 - val_accuracy: 0.6633 - lr: 4.0000e-05
Epoch 15/100
472/473 [============================>.] - ETA: 0s - loss: 0.7166 - accuracy: 0.7120
Epoch 15: val_accuracy did not improve from 0.66325
473/473 [==============================] - 25s 53ms/step - loss: 0.7168 - accuracy: 0.7119 - val_loss: 0.8271 - val_accuracy: 0.6612 - lr: 4.0000e-05
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 0.7163 - accuracy: 0.7128
Epoch 16: val_accuracy did not improve from 0.66325
473/473 [==============================] - 25s 54ms/step - loss: 0.7163 - accuracy: 0.7128 - val_loss: 0.8405 - val_accuracy: 0.6550 - lr: 4.0000e-05
Epoch 17/100
472/473 [============================>.] - ETA: 0s - loss: 0.7055 - accuracy: 0.7178
Epoch 17: val_accuracy improved from 0.66325 to 0.66727, saving model to ./model_3.h5

Epoch 17: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
473/473 [==============================] - 25s 52ms/step - loss: 0.7055 - accuracy: 0.7178 - val_loss: 0.8243 - val_accuracy: 0.6673 - lr: 4.0000e-05
Epoch 18/100
472/473 [============================>.] - ETA: 0s - loss: 0.7059 - accuracy: 0.7172
Epoch 18: val_accuracy did not improve from 0.66727
473/473 [==============================] - 25s 53ms/step - loss: 0.7059 - accuracy: 0.7173 - val_loss: 0.8273 - val_accuracy: 0.6641 - lr: 8.0000e-06
Epoch 19/100
473/473 [==============================] - ETA: 0s - loss: 0.7080 - accuracy: 0.7122
Epoch 19: val_accuracy did not improve from 0.66727
473/473 [==============================] - 25s 52ms/step - loss: 0.7080 - accuracy: 0.7122 - val_loss: 0.8163 - val_accuracy: 0.6641 - lr: 8.0000e-06
Epoch 20/100
472/473 [============================>.] - ETA: 0s - loss: 0.7017 - accuracy: 0.7190
Epoch 20: val_accuracy did not improve from 0.66727

Epoch 20: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06.
473/473 [==============================] - 25s 52ms/step - loss: 0.7014 - accuracy: 0.7191 - val_loss: 0.8168 - val_accuracy: 0.6661 - lr: 8.0000e-06
Epoch 21/100
473/473 [==============================] - ETA: 0s - loss: 0.7071 - accuracy: 0.7149
Epoch 21: val_accuracy did not improve from 0.66727
473/473 [==============================] - 25s 54ms/step - loss: 0.7071 - accuracy: 0.7149 - val_loss: 0.8214 - val_accuracy: 0.6548 - lr: 1.6000e-06
Epoch 22/100
472/473 [============================>.] - ETA: 0s - loss: 0.7026 - accuracy: 0.7123
Epoch 22: val_accuracy did not improve from 0.66727
473/473 [==============================] - 25s 53ms/step - loss: 0.7029 - accuracy: 0.7122 - val_loss: 0.8246 - val_accuracy: 0.6612 - lr: 1.6000e-06
Epoch 23/100
472/473 [============================>.] - ETA: 0s - loss: 0.6989 - accuracy: 0.7184
Epoch 23: val_accuracy did not improve from 0.66727

Epoch 23: ReduceLROnPlateau reducing learning rate to 3.200000264769187e-07.
473/473 [==============================] - 24s 52ms/step - loss: 0.6986 - accuracy: 0.7186 - val_loss: 0.8369 - val_accuracy: 0.6604 - lr: 1.6000e-06
Epoch 24/100
473/473 [==============================] - ETA: 0s - loss: 0.7066 - accuracy: 0.7187
Epoch 24: val_accuracy did not improve from 0.66727
473/473 [==============================] - 25s 53ms/step - loss: 0.7066 - accuracy: 0.7187 - val_loss: 0.8291 - val_accuracy: 0.6526 - lr: 3.2000e-07
Epoch 25/100
473/473 [==============================] - ETA: 0s - loss: 0.7011 - accuracy: 0.7216
Epoch 25: val_accuracy did not improve from 0.66727
473/473 [==============================] - 26s 54ms/step - loss: 0.7011 - accuracy: 0.7216 - val_loss: 0.8222 - val_accuracy: 0.6624 - lr: 3.2000e-07
Epoch 26/100
473/473 [==============================] - ETA: 0s - loss: 0.6997 - accuracy: 0.7165
Epoch 26: val_accuracy did not improve from 0.66727
473/473 [==============================] - 25s 52ms/step - loss: 0.6997 - accuracy: 0.7165 - val_loss: 0.8136 - val_accuracy: 0.6641 - lr: 3.2000e-07
Epoch 27/100
472/473 [============================>.] - ETA: 0s - loss: 0.7075 - accuracy: 0.7138
Epoch 27: val_accuracy improved from 0.66727 to 0.66787, saving model to ./model_3.h5
473/473 [==============================] - 25s 52ms/step - loss: 0.7071 - accuracy: 0.7139 - val_loss: 0.8259 - val_accuracy: 0.6679 - lr: 3.2000e-07
Epoch 28/100
472/473 [============================>.] - ETA: 0s - loss: 0.7100 - accuracy: 0.7143
Epoch 28: val_accuracy did not improve from 0.66787
473/473 [==============================] - 25s 52ms/step - loss: 0.7100 - accuracy: 0.7142 - val_loss: 0.8182 - val_accuracy: 0.6659 - lr: 3.2000e-07
Epoch 29/100
472/473 [============================>.] - ETA: 0s - loss: 0.7110 - accuracy: 0.7148
Epoch 29: val_accuracy improved from 0.66787 to 0.66868, saving model to ./model_3.h5
473/473 [==============================] - 25s 54ms/step - loss: 0.7107 - accuracy: 0.7150 - val_loss: 0.8042 - val_accuracy: 0.6687 - lr: 3.2000e-07
Epoch 30/100
472/473 [============================>.] - ETA: 0s - loss: 0.7077 - accuracy: 0.7147
Epoch 30: val_accuracy did not improve from 0.66868
473/473 [==============================] - 25s 53ms/step - loss: 0.7076 - accuracy: 0.7147 - val_loss: 0.8247 - val_accuracy: 0.6628 - lr: 3.2000e-07
Epoch 31/100
473/473 [==============================] - ETA: 0s - loss: 0.7021 - accuracy: 0.7169
Epoch 31: val_accuracy did not improve from 0.66868
473/473 [==============================] - 25s 52ms/step - loss: 0.7021 - accuracy: 0.7169 - val_loss: 0.8272 - val_accuracy: 0.6626 - lr: 3.2000e-07
Epoch 32/100
473/473 [==============================] - ETA: 0s - loss: 0.7035 - accuracy: 0.7140
Epoch 32: val_accuracy did not improve from 0.66868

Epoch 32: ReduceLROnPlateau reducing learning rate to 6.400000529538374e-08.
473/473 [==============================] - 25s 53ms/step - loss: 0.7035 - accuracy: 0.7140 - val_loss: 0.8372 - val_accuracy: 0.6608 - lr: 3.2000e-07
Epoch 33/100
473/473 [==============================] - ETA: 0s - loss: 0.7164 - accuracy: 0.7110
Epoch 33: val_accuracy did not improve from 0.66868
473/473 [==============================] - 25s 52ms/step - loss: 0.7164 - accuracy: 0.7110 - val_loss: 0.8113 - val_accuracy: 0.6622 - lr: 6.4000e-08
Epoch 34/100
473/473 [==============================] - ETA: 0s - loss: 0.7049 - accuracy: 0.7217
Epoch 34: val_accuracy did not improve from 0.66868
473/473 [==============================] - 26s 56ms/step - loss: 0.7049 - accuracy: 0.7217 - val_loss: 0.8284 - val_accuracy: 0.6653 - lr: 6.4000e-08
Epoch 35/100
472/473 [============================>.] - ETA: 0s - loss: 0.6998 - accuracy: 0.7208
Epoch 35: val_accuracy did not improve from 0.66868

Epoch 35: ReduceLROnPlateau reducing learning rate to 1.2800001059076749e-08.
473/473 [==============================] - 25s 53ms/step - loss: 0.6995 - accuracy: 0.7211 - val_loss: 0.8243 - val_accuracy: 0.6624 - lr: 6.4000e-08
Epoch 36/100
473/473 [==============================] - ETA: 0s - loss: 0.7031 - accuracy: 0.7179
Epoch 36: val_accuracy did not improve from 0.66868
473/473 [==============================] - 25s 53ms/step - loss: 0.7031 - accuracy: 0.7179 - val_loss: 0.8346 - val_accuracy: 0.6552 - lr: 1.2800e-08
Epoch 37/100
473/473 [==============================] - ETA: 0s - loss: 0.7026 - accuracy: 0.7196
Epoch 37: val_accuracy did not improve from 0.66868
473/473 [==============================] - 25s 52ms/step - loss: 0.7026 - accuracy: 0.7196 - val_loss: 0.8184 - val_accuracy: 0.6641 - lr: 1.2800e-08
Epoch 38/100
472/473 [============================>.] - ETA: 0s - loss: 0.6993 - accuracy: 0.7222
Epoch 38: val_accuracy did not improve from 0.66868

Epoch 38: ReduceLROnPlateau reducing learning rate to 2.5600002118153498e-09.
473/473 [==============================] - 25s 54ms/step - loss: 0.6993 - accuracy: 0.7222 - val_loss: 0.8251 - val_accuracy: 0.6556 - lr: 1.2800e-08
Epoch 39/100
472/473 [============================>.] - ETA: 0s - loss: 0.7074 - accuracy: 0.7116
Epoch 39: val_accuracy did not improve from 0.66868
473/473 [==============================] - 24s 51ms/step - loss: 0.7083 - accuracy: 0.7112 - val_loss: 0.8227 - val_accuracy: 0.6576 - lr: 2.5600e-09
Epoch 40/100
472/473 [============================>.] - ETA: 0s - loss: 0.7031 - accuracy: 0.7163
Epoch 40: val_accuracy did not improve from 0.66868
473/473 [==============================] - 24s 51ms/step - loss: 0.7029 - accuracy: 0.7163 - val_loss: 0.8238 - val_accuracy: 0.6651 - lr: 2.5600e-09
Epoch 41/100
472/473 [============================>.] - ETA: 0s - loss: 0.6988 - accuracy: 0.7201
Epoch 41: val_accuracy did not improve from 0.66868

Epoch 41: ReduceLROnPlateau reducing learning rate to 5.1200004236307e-10.
473/473 [==============================] - 25s 53ms/step - loss: 0.6992 - accuracy: 0.7200 - val_loss: 0.8195 - val_accuracy: 0.6628 - lr: 2.5600e-09
Epoch 42/100
473/473 [==============================] - ETA: 0s - loss: 0.7109 - accuracy: 0.7165
Epoch 42: val_accuracy did not improve from 0.66868
473/473 [==============================] - 26s 55ms/step - loss: 0.7109 - accuracy: 0.7165 - val_loss: 0.8230 - val_accuracy: 0.6659 - lr: 5.1200e-10
Epoch 43/100
473/473 [==============================] - ETA: 0s - loss: 0.6996 - accuracy: 0.7215
Epoch 43: val_accuracy did not improve from 0.66868
473/473 [==============================] - 25s 53ms/step - loss: 0.6996 - accuracy: 0.7215 - val_loss: 0.8386 - val_accuracy: 0.6566 - lr: 5.1200e-10
Epoch 44/100
473/473 [==============================] - ETA: 0s - loss: 0.7026 - accuracy: 0.7174
Epoch 44: val_accuracy did not improve from 0.66868
Restoring model weights from the end of the best epoch: 29.

Epoch 44: ReduceLROnPlateau reducing learning rate to 1.0240001069306004e-10.
473/473 [==============================] - 25s 52ms/step - loss: 0.7026 - accuracy: 0.7174 - val_loss: 0.8289 - val_accuracy: 0.6626 - lr: 5.1200e-10
Epoch 44: early stopping
In [62]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_3.history['accuracy'])
plt.plot(history_3.history['val_accuracy'])
plt.title('Accuracy - VGG16 Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [63]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_3.history['loss'])
plt.plot(history_3.history['val_loss'])
plt.title('Loss - VGG16 Model')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the VGG16 model¶

In [64]:
# Evaluating the model's performance on the test set
accuracy = model_3.evaluate(test_set_rgb)
4/4 [==============================] - 0s 40ms/step - loss: 0.7386 - accuracy: 0.6562

Observations and Insights:
As imported and modified, our transfer learning model seems to perform similarly to our previous models developed above. After 29 epochs (best epoch), training accuracy stands at 0.72 and validation accuracy is 0.67. Accuracy and loss for both the training and validation data level off before early stopping ends the training. The model's performance on the test data stands at 0.66. These scores are roughly in line with the scores of Model 1, our baseline model.

The VGG16 model was ultimately imported up to layer block4_pool, as it produced the best performance. A history of alternative models is below.

Train Loss Train Accuracy Val Loss Val Accuracy
VGG16 block4_pool (selected) 0.71 0.72 0.80 0.67
VGG16 block5_pool 1.05 0.54 1.10 0.52
VGG16 block3_pool 0.79 0.69 0.77 0.66
VGG16 block2_pool 0.71 0.71 0.82 0.65


Model 4: ResNet v2¶

Our second transfer learning model is ResNet v2, which is a CNN trained on over 1 million images from the ImageNet database. ResNet v2 can classify images into 1,000 different categories. Like VGG16, colormode must be set to RGB to leverage this pre-trained architecture.

In [65]:
Resnet = ap.ResNet101(include_top = False, weights = "imagenet", input_shape=(48,48,3))
Resnet.summary()
Model: "resnet101"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_2 (InputLayer)           [(None, 48, 48, 3)]  0           []                               
                                                                                                  
 conv1_pad (ZeroPadding2D)      (None, 54, 54, 3)    0           ['input_2[0][0]']                
                                                                                                  
 conv1_conv (Conv2D)            (None, 24, 24, 64)   9472        ['conv1_pad[0][0]']              
                                                                                                  
 conv1_bn (BatchNormalization)  (None, 24, 24, 64)   256         ['conv1_conv[0][0]']             
                                                                                                  
 conv1_relu (Activation)        (None, 24, 24, 64)   0           ['conv1_bn[0][0]']               
                                                                                                  
 pool1_pad (ZeroPadding2D)      (None, 26, 26, 64)   0           ['conv1_relu[0][0]']             
                                                                                                  
 pool1_pool (MaxPooling2D)      (None, 12, 12, 64)   0           ['pool1_pad[0][0]']              
                                                                                                  
 conv2_block1_1_conv (Conv2D)   (None, 12, 12, 64)   4160        ['pool1_pool[0][0]']             
                                                                                                  
 conv2_block1_1_bn (BatchNormal  (None, 12, 12, 64)  256         ['conv2_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_1_relu (Activatio  (None, 12, 12, 64)  0           ['conv2_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block1_2_conv (Conv2D)   (None, 12, 12, 64)   36928       ['conv2_block1_1_relu[0][0]']    
                                                                                                  
 conv2_block1_2_bn (BatchNormal  (None, 12, 12, 64)  256         ['conv2_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_2_relu (Activatio  (None, 12, 12, 64)  0           ['conv2_block1_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block1_0_conv (Conv2D)   (None, 12, 12, 256)  16640       ['pool1_pool[0][0]']             
                                                                                                  
 conv2_block1_3_conv (Conv2D)   (None, 12, 12, 256)  16640       ['conv2_block1_2_relu[0][0]']    
                                                                                                  
 conv2_block1_0_bn (BatchNormal  (None, 12, 12, 256)  1024       ['conv2_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_3_bn (BatchNormal  (None, 12, 12, 256)  1024       ['conv2_block1_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_add (Add)         (None, 12, 12, 256)  0           ['conv2_block1_0_bn[0][0]',      
                                                                  'conv2_block1_3_bn[0][0]']      
                                                                                                  
 conv2_block1_out (Activation)  (None, 12, 12, 256)  0           ['conv2_block1_add[0][0]']       
                                                                                                  
 conv2_block2_1_conv (Conv2D)   (None, 12, 12, 64)   16448       ['conv2_block1_out[0][0]']       
                                                                                                  
 conv2_block2_1_bn (BatchNormal  (None, 12, 12, 64)  256         ['conv2_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_1_relu (Activatio  (None, 12, 12, 64)  0           ['conv2_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block2_2_conv (Conv2D)   (None, 12, 12, 64)   36928       ['conv2_block2_1_relu[0][0]']    
                                                                                                  
 conv2_block2_2_bn (BatchNormal  (None, 12, 12, 64)  256         ['conv2_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_2_relu (Activatio  (None, 12, 12, 64)  0           ['conv2_block2_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block2_3_conv (Conv2D)   (None, 12, 12, 256)  16640       ['conv2_block2_2_relu[0][0]']    
                                                                                                  
 conv2_block2_3_bn (BatchNormal  (None, 12, 12, 256)  1024       ['conv2_block2_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_add (Add)         (None, 12, 12, 256)  0           ['conv2_block1_out[0][0]',       
                                                                  'conv2_block2_3_bn[0][0]']      
                                                                                                  
 conv2_block2_out (Activation)  (None, 12, 12, 256)  0           ['conv2_block2_add[0][0]']       
                                                                                                  
 conv2_block3_1_conv (Conv2D)   (None, 12, 12, 64)   16448       ['conv2_block2_out[0][0]']       
                                                                                                  
 conv2_block3_1_bn (BatchNormal  (None, 12, 12, 64)  256         ['conv2_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block3_1_relu (Activatio  (None, 12, 12, 64)  0           ['conv2_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block3_2_conv (Conv2D)   (None, 12, 12, 64)   36928       ['conv2_block3_1_relu[0][0]']    
                                                                                                  
 conv2_block3_2_bn (BatchNormal  (None, 12, 12, 64)  256         ['conv2_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block3_2_relu (Activatio  (None, 12, 12, 64)  0           ['conv2_block3_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block3_3_conv (Conv2D)   (None, 12, 12, 256)  16640       ['conv2_block3_2_relu[0][0]']    
                                                                                                  
 conv2_block3_3_bn (BatchNormal  (None, 12, 12, 256)  1024       ['conv2_block3_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block3_add (Add)         (None, 12, 12, 256)  0           ['conv2_block2_out[0][0]',       
                                                                  'conv2_block3_3_bn[0][0]']      
                                                                                                  
 conv2_block3_out (Activation)  (None, 12, 12, 256)  0           ['conv2_block3_add[0][0]']       
                                                                                                  
 conv3_block1_1_conv (Conv2D)   (None, 6, 6, 128)    32896       ['conv2_block3_out[0][0]']       
                                                                                                  
 conv3_block1_1_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_1_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block1_2_conv (Conv2D)   (None, 6, 6, 128)    147584      ['conv3_block1_1_relu[0][0]']    
                                                                                                  
 conv3_block1_2_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_2_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block1_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block1_0_conv (Conv2D)   (None, 6, 6, 512)    131584      ['conv2_block3_out[0][0]']       
                                                                                                  
 conv3_block1_3_conv (Conv2D)   (None, 6, 6, 512)    66048       ['conv3_block1_2_relu[0][0]']    
                                                                                                  
 conv3_block1_0_bn (BatchNormal  (None, 6, 6, 512)   2048        ['conv3_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_3_bn (BatchNormal  (None, 6, 6, 512)   2048        ['conv3_block1_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_add (Add)         (None, 6, 6, 512)    0           ['conv3_block1_0_bn[0][0]',      
                                                                  'conv3_block1_3_bn[0][0]']      
                                                                                                  
 conv3_block1_out (Activation)  (None, 6, 6, 512)    0           ['conv3_block1_add[0][0]']       
                                                                                                  
 conv3_block2_1_conv (Conv2D)   (None, 6, 6, 128)    65664       ['conv3_block1_out[0][0]']       
                                                                                                  
 conv3_block2_1_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_1_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block2_2_conv (Conv2D)   (None, 6, 6, 128)    147584      ['conv3_block2_1_relu[0][0]']    
                                                                                                  
 conv3_block2_2_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_2_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block2_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block2_3_conv (Conv2D)   (None, 6, 6, 512)    66048       ['conv3_block2_2_relu[0][0]']    
                                                                                                  
 conv3_block2_3_bn (BatchNormal  (None, 6, 6, 512)   2048        ['conv3_block2_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_add (Add)         (None, 6, 6, 512)    0           ['conv3_block1_out[0][0]',       
                                                                  'conv3_block2_3_bn[0][0]']      
                                                                                                  
 conv3_block2_out (Activation)  (None, 6, 6, 512)    0           ['conv3_block2_add[0][0]']       
                                                                                                  
 conv3_block3_1_conv (Conv2D)   (None, 6, 6, 128)    65664       ['conv3_block2_out[0][0]']       
                                                                                                  
 conv3_block3_1_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block3_1_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block3_2_conv (Conv2D)   (None, 6, 6, 128)    147584      ['conv3_block3_1_relu[0][0]']    
                                                                                                  
 conv3_block3_2_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block3_2_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block3_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block3_3_conv (Conv2D)   (None, 6, 6, 512)    66048       ['conv3_block3_2_relu[0][0]']    
                                                                                                  
 conv3_block3_3_bn (BatchNormal  (None, 6, 6, 512)   2048        ['conv3_block3_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block3_add (Add)         (None, 6, 6, 512)    0           ['conv3_block2_out[0][0]',       
                                                                  'conv3_block3_3_bn[0][0]']      
                                                                                                  
 conv3_block3_out (Activation)  (None, 6, 6, 512)    0           ['conv3_block3_add[0][0]']       
                                                                                                  
 conv3_block4_1_conv (Conv2D)   (None, 6, 6, 128)    65664       ['conv3_block3_out[0][0]']       
                                                                                                  
 conv3_block4_1_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block4_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block4_1_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block4_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block4_2_conv (Conv2D)   (None, 6, 6, 128)    147584      ['conv3_block4_1_relu[0][0]']    
                                                                                                  
 conv3_block4_2_bn (BatchNormal  (None, 6, 6, 128)   512         ['conv3_block4_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block4_2_relu (Activatio  (None, 6, 6, 128)   0           ['conv3_block4_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block4_3_conv (Conv2D)   (None, 6, 6, 512)    66048       ['conv3_block4_2_relu[0][0]']    
                                                                                                  
 conv3_block4_3_bn (BatchNormal  (None, 6, 6, 512)   2048        ['conv3_block4_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block4_add (Add)         (None, 6, 6, 512)    0           ['conv3_block3_out[0][0]',       
                                                                  'conv3_block4_3_bn[0][0]']      
                                                                                                  
 conv3_block4_out (Activation)  (None, 6, 6, 512)    0           ['conv3_block4_add[0][0]']       
                                                                                                  
 conv4_block1_1_conv (Conv2D)   (None, 3, 3, 256)    131328      ['conv3_block4_out[0][0]']       
                                                                                                  
 conv4_block1_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block1_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block1_1_relu[0][0]']    
                                                                                                  
 conv4_block1_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block1_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block1_0_conv (Conv2D)   (None, 3, 3, 1024)   525312      ['conv3_block4_out[0][0]']       
                                                                                                  
 conv4_block1_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block1_2_relu[0][0]']    
                                                                                                  
 conv4_block1_0_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block1_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block1_0_bn[0][0]',      
                                                                  'conv4_block1_3_bn[0][0]']      
                                                                                                  
 conv4_block1_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block1_add[0][0]']       
                                                                                                  
 conv4_block2_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block1_out[0][0]']       
                                                                                                  
 conv4_block2_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block2_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block2_1_relu[0][0]']    
                                                                                                  
 conv4_block2_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block2_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block2_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block2_2_relu[0][0]']    
                                                                                                  
 conv4_block2_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block2_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block1_out[0][0]',       
                                                                  'conv4_block2_3_bn[0][0]']      
                                                                                                  
 conv4_block2_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block2_add[0][0]']       
                                                                                                  
 conv4_block3_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block2_out[0][0]']       
                                                                                                  
 conv4_block3_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block3_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block3_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block3_1_relu[0][0]']    
                                                                                                  
 conv4_block3_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block3_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block3_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block3_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block3_2_relu[0][0]']    
                                                                                                  
 conv4_block3_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block3_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block3_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block2_out[0][0]',       
                                                                  'conv4_block3_3_bn[0][0]']      
                                                                                                  
 conv4_block3_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block3_add[0][0]']       
                                                                                                  
 conv4_block4_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block3_out[0][0]']       
                                                                                                  
 conv4_block4_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block4_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block4_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block4_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block4_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block4_1_relu[0][0]']    
                                                                                                  
 conv4_block4_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block4_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block4_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block4_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block4_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block4_2_relu[0][0]']    
                                                                                                  
 conv4_block4_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block4_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block4_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block3_out[0][0]',       
                                                                  'conv4_block4_3_bn[0][0]']      
                                                                                                  
 conv4_block4_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block4_add[0][0]']       
                                                                                                  
 conv4_block5_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block4_out[0][0]']       
                                                                                                  
 conv4_block5_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block5_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block5_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block5_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block5_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block5_1_relu[0][0]']    
                                                                                                  
 conv4_block5_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block5_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block5_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block5_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block5_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block5_2_relu[0][0]']    
                                                                                                  
 conv4_block5_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block5_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block5_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block4_out[0][0]',       
                                                                  'conv4_block5_3_bn[0][0]']      
                                                                                                  
 conv4_block5_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block5_add[0][0]']       
                                                                                                  
 conv4_block6_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block5_out[0][0]']       
                                                                                                  
 conv4_block6_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block6_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block6_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block6_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block6_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block6_1_relu[0][0]']    
                                                                                                  
 conv4_block6_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block6_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block6_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block6_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block6_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block6_2_relu[0][0]']    
                                                                                                  
 conv4_block6_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block6_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block6_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block5_out[0][0]',       
                                                                  'conv4_block6_3_bn[0][0]']      
                                                                                                  
 conv4_block6_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block6_add[0][0]']       
                                                                                                  
 conv4_block7_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block6_out[0][0]']       
                                                                                                  
 conv4_block7_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block7_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block7_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block7_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block7_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block7_1_relu[0][0]']    
                                                                                                  
 conv4_block7_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block7_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block7_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block7_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block7_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block7_2_relu[0][0]']    
                                                                                                  
 conv4_block7_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block7_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block7_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block6_out[0][0]',       
                                                                  'conv4_block7_3_bn[0][0]']      
                                                                                                  
 conv4_block7_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block7_add[0][0]']       
                                                                                                  
 conv4_block8_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block7_out[0][0]']       
                                                                                                  
 conv4_block8_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block8_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block8_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block8_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block8_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block8_1_relu[0][0]']    
                                                                                                  
 conv4_block8_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block8_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block8_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block8_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block8_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block8_2_relu[0][0]']    
                                                                                                  
 conv4_block8_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block8_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block8_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block7_out[0][0]',       
                                                                  'conv4_block8_3_bn[0][0]']      
                                                                                                  
 conv4_block8_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block8_add[0][0]']       
                                                                                                  
 conv4_block9_1_conv (Conv2D)   (None, 3, 3, 256)    262400      ['conv4_block8_out[0][0]']       
                                                                                                  
 conv4_block9_1_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block9_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block9_1_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block9_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block9_2_conv (Conv2D)   (None, 3, 3, 256)    590080      ['conv4_block9_1_relu[0][0]']    
                                                                                                  
 conv4_block9_2_bn (BatchNormal  (None, 3, 3, 256)   1024        ['conv4_block9_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block9_2_relu (Activatio  (None, 3, 3, 256)   0           ['conv4_block9_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block9_3_conv (Conv2D)   (None, 3, 3, 1024)   263168      ['conv4_block9_2_relu[0][0]']    
                                                                                                  
 conv4_block9_3_bn (BatchNormal  (None, 3, 3, 1024)  4096        ['conv4_block9_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block9_add (Add)         (None, 3, 3, 1024)   0           ['conv4_block8_out[0][0]',       
                                                                  'conv4_block9_3_bn[0][0]']      
                                                                                                  
 conv4_block9_out (Activation)  (None, 3, 3, 1024)   0           ['conv4_block9_add[0][0]']       
                                                                                                  
 conv4_block10_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block9_out[0][0]']       
                                                                                                  
 conv4_block10_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block10_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block10_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block10_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block10_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block10_1_relu[0][0]']   
                                                                                                  
 conv4_block10_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block10_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block10_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block10_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block10_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block10_2_relu[0][0]']   
                                                                                                  
 conv4_block10_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block10_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block10_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block9_out[0][0]',       
                                                                  'conv4_block10_3_bn[0][0]']     
                                                                                                  
 conv4_block10_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block10_add[0][0]']      
                                                                                                  
 conv4_block11_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block10_out[0][0]']      
                                                                                                  
 conv4_block11_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block11_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block11_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block11_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block11_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block11_1_relu[0][0]']   
                                                                                                  
 conv4_block11_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block11_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block11_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block11_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block11_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block11_2_relu[0][0]']   
                                                                                                  
 conv4_block11_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block11_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block11_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block10_out[0][0]',      
                                                                  'conv4_block11_3_bn[0][0]']     
                                                                                                  
 conv4_block11_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block11_add[0][0]']      
                                                                                                  
 conv4_block12_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block11_out[0][0]']      
                                                                                                  
 conv4_block12_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block12_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block12_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block12_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block12_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block12_1_relu[0][0]']   
                                                                                                  
 conv4_block12_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block12_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block12_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block12_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block12_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block12_2_relu[0][0]']   
                                                                                                  
 conv4_block12_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block12_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block12_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block11_out[0][0]',      
                                                                  'conv4_block12_3_bn[0][0]']     
                                                                                                  
 conv4_block12_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block12_add[0][0]']      
                                                                                                  
 conv4_block13_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block12_out[0][0]']      
                                                                                                  
 conv4_block13_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block13_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block13_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block13_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block13_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block13_1_relu[0][0]']   
                                                                                                  
 conv4_block13_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block13_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block13_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block13_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block13_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block13_2_relu[0][0]']   
                                                                                                  
 conv4_block13_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block13_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block13_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block12_out[0][0]',      
                                                                  'conv4_block13_3_bn[0][0]']     
                                                                                                  
 conv4_block13_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block13_add[0][0]']      
                                                                                                  
 conv4_block14_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block13_out[0][0]']      
                                                                                                  
 conv4_block14_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block14_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block14_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block14_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block14_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block14_1_relu[0][0]']   
                                                                                                  
 conv4_block14_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block14_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block14_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block14_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block14_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block14_2_relu[0][0]']   
                                                                                                  
 conv4_block14_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block14_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block14_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block13_out[0][0]',      
                                                                  'conv4_block14_3_bn[0][0]']     
                                                                                                  
 conv4_block14_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block14_add[0][0]']      
                                                                                                  
 conv4_block15_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block14_out[0][0]']      
                                                                                                  
 conv4_block15_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block15_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block15_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block15_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block15_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block15_1_relu[0][0]']   
                                                                                                  
 conv4_block15_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block15_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block15_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block15_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block15_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block15_2_relu[0][0]']   
                                                                                                  
 conv4_block15_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block15_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block15_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block14_out[0][0]',      
                                                                  'conv4_block15_3_bn[0][0]']     
                                                                                                  
 conv4_block15_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block15_add[0][0]']      
                                                                                                  
 conv4_block16_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block15_out[0][0]']      
                                                                                                  
 conv4_block16_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block16_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block16_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block16_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block16_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block16_1_relu[0][0]']   
                                                                                                  
 conv4_block16_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block16_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block16_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block16_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block16_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block16_2_relu[0][0]']   
                                                                                                  
 conv4_block16_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block16_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block16_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block15_out[0][0]',      
                                                                  'conv4_block16_3_bn[0][0]']     
                                                                                                  
 conv4_block16_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block16_add[0][0]']      
                                                                                                  
 conv4_block17_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block16_out[0][0]']      
                                                                                                  
 conv4_block17_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block17_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block17_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block17_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block17_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block17_1_relu[0][0]']   
                                                                                                  
 conv4_block17_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block17_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block17_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block17_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block17_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block17_2_relu[0][0]']   
                                                                                                  
 conv4_block17_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block17_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block17_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block16_out[0][0]',      
                                                                  'conv4_block17_3_bn[0][0]']     
                                                                                                  
 conv4_block17_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block17_add[0][0]']      
                                                                                                  
 conv4_block18_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block17_out[0][0]']      
                                                                                                  
 conv4_block18_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block18_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block18_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block18_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block18_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block18_1_relu[0][0]']   
                                                                                                  
 conv4_block18_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block18_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block18_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block18_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block18_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block18_2_relu[0][0]']   
                                                                                                  
 conv4_block18_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block18_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block18_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block17_out[0][0]',      
                                                                  'conv4_block18_3_bn[0][0]']     
                                                                                                  
 conv4_block18_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block18_add[0][0]']      
                                                                                                  
 conv4_block19_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block18_out[0][0]']      
                                                                                                  
 conv4_block19_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block19_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block19_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block19_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block19_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block19_1_relu[0][0]']   
                                                                                                  
 conv4_block19_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block19_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block19_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block19_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block19_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block19_2_relu[0][0]']   
                                                                                                  
 conv4_block19_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block19_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block19_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block18_out[0][0]',      
                                                                  'conv4_block19_3_bn[0][0]']     
                                                                                                  
 conv4_block19_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block19_add[0][0]']      
                                                                                                  
 conv4_block20_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block19_out[0][0]']      
                                                                                                  
 conv4_block20_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block20_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block20_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block20_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block20_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block20_1_relu[0][0]']   
                                                                                                  
 conv4_block20_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block20_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block20_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block20_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block20_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block20_2_relu[0][0]']   
                                                                                                  
 conv4_block20_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block20_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block20_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block19_out[0][0]',      
                                                                  'conv4_block20_3_bn[0][0]']     
                                                                                                  
 conv4_block20_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block20_add[0][0]']      
                                                                                                  
 conv4_block21_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block20_out[0][0]']      
                                                                                                  
 conv4_block21_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block21_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block21_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block21_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block21_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block21_1_relu[0][0]']   
                                                                                                  
 conv4_block21_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block21_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block21_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block21_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block21_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block21_2_relu[0][0]']   
                                                                                                  
 conv4_block21_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block21_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block21_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block20_out[0][0]',      
                                                                  'conv4_block21_3_bn[0][0]']     
                                                                                                  
 conv4_block21_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block21_add[0][0]']      
                                                                                                  
 conv4_block22_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block21_out[0][0]']      
                                                                                                  
 conv4_block22_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block22_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block22_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block22_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block22_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block22_1_relu[0][0]']   
                                                                                                  
 conv4_block22_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block22_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block22_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block22_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block22_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block22_2_relu[0][0]']   
                                                                                                  
 conv4_block22_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block22_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block22_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block21_out[0][0]',      
                                                                  'conv4_block22_3_bn[0][0]']     
                                                                                                  
 conv4_block22_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block22_add[0][0]']      
                                                                                                  
 conv4_block23_1_conv (Conv2D)  (None, 3, 3, 256)    262400      ['conv4_block22_out[0][0]']      
                                                                                                  
 conv4_block23_1_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block23_1_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block23_1_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block23_1_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block23_2_conv (Conv2D)  (None, 3, 3, 256)    590080      ['conv4_block23_1_relu[0][0]']   
                                                                                                  
 conv4_block23_2_bn (BatchNorma  (None, 3, 3, 256)   1024        ['conv4_block23_2_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block23_2_relu (Activati  (None, 3, 3, 256)   0           ['conv4_block23_2_bn[0][0]']     
 on)                                                                                              
                                                                                                  
 conv4_block23_3_conv (Conv2D)  (None, 3, 3, 1024)   263168      ['conv4_block23_2_relu[0][0]']   
                                                                                                  
 conv4_block23_3_bn (BatchNorma  (None, 3, 3, 1024)  4096        ['conv4_block23_3_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 conv4_block23_add (Add)        (None, 3, 3, 1024)   0           ['conv4_block22_out[0][0]',      
                                                                  'conv4_block23_3_bn[0][0]']     
                                                                                                  
 conv4_block23_out (Activation)  (None, 3, 3, 1024)  0           ['conv4_block23_add[0][0]']      
                                                                                                  
 conv5_block1_1_conv (Conv2D)   (None, 2, 2, 512)    524800      ['conv4_block23_out[0][0]']      
                                                                                                  
 conv5_block1_1_bn (BatchNormal  (None, 2, 2, 512)   2048        ['conv5_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_1_relu (Activatio  (None, 2, 2, 512)   0           ['conv5_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block1_2_conv (Conv2D)   (None, 2, 2, 512)    2359808     ['conv5_block1_1_relu[0][0]']    
                                                                                                  
 conv5_block1_2_bn (BatchNormal  (None, 2, 2, 512)   2048        ['conv5_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_2_relu (Activatio  (None, 2, 2, 512)   0           ['conv5_block1_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block1_0_conv (Conv2D)   (None, 2, 2, 2048)   2099200     ['conv4_block23_out[0][0]']      
                                                                                                  
 conv5_block1_3_conv (Conv2D)   (None, 2, 2, 2048)   1050624     ['conv5_block1_2_relu[0][0]']    
                                                                                                  
 conv5_block1_0_bn (BatchNormal  (None, 2, 2, 2048)  8192        ['conv5_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_3_bn (BatchNormal  (None, 2, 2, 2048)  8192        ['conv5_block1_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_add (Add)         (None, 2, 2, 2048)   0           ['conv5_block1_0_bn[0][0]',      
                                                                  'conv5_block1_3_bn[0][0]']      
                                                                                                  
 conv5_block1_out (Activation)  (None, 2, 2, 2048)   0           ['conv5_block1_add[0][0]']       
                                                                                                  
 conv5_block2_1_conv (Conv2D)   (None, 2, 2, 512)    1049088     ['conv5_block1_out[0][0]']       
                                                                                                  
 conv5_block2_1_bn (BatchNormal  (None, 2, 2, 512)   2048        ['conv5_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_1_relu (Activatio  (None, 2, 2, 512)   0           ['conv5_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block2_2_conv (Conv2D)   (None, 2, 2, 512)    2359808     ['conv5_block2_1_relu[0][0]']    
                                                                                                  
 conv5_block2_2_bn (BatchNormal  (None, 2, 2, 512)   2048        ['conv5_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_2_relu (Activatio  (None, 2, 2, 512)   0           ['conv5_block2_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block2_3_conv (Conv2D)   (None, 2, 2, 2048)   1050624     ['conv5_block2_2_relu[0][0]']    
                                                                                                  
 conv5_block2_3_bn (BatchNormal  (None, 2, 2, 2048)  8192        ['conv5_block2_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_add (Add)         (None, 2, 2, 2048)   0           ['conv5_block1_out[0][0]',       
                                                                  'conv5_block2_3_bn[0][0]']      
                                                                                                  
 conv5_block2_out (Activation)  (None, 2, 2, 2048)   0           ['conv5_block2_add[0][0]']       
                                                                                                  
 conv5_block3_1_conv (Conv2D)   (None, 2, 2, 512)    1049088     ['conv5_block2_out[0][0]']       
                                                                                                  
 conv5_block3_1_bn (BatchNormal  (None, 2, 2, 512)   2048        ['conv5_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block3_1_relu (Activatio  (None, 2, 2, 512)   0           ['conv5_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block3_2_conv (Conv2D)   (None, 2, 2, 512)    2359808     ['conv5_block3_1_relu[0][0]']    
                                                                                                  
 conv5_block3_2_bn (BatchNormal  (None, 2, 2, 512)   2048        ['conv5_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block3_2_relu (Activatio  (None, 2, 2, 512)   0           ['conv5_block3_2_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block3_3_conv (Conv2D)   (None, 2, 2, 2048)   1050624     ['conv5_block3_2_relu[0][0]']    
                                                                                                  
 conv5_block3_3_bn (BatchNormal  (None, 2, 2, 2048)  8192        ['conv5_block3_3_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block3_add (Add)         (None, 2, 2, 2048)   0           ['conv5_block2_out[0][0]',       
                                                                  'conv5_block3_3_bn[0][0]']      
                                                                                                  
 conv5_block3_out (Activation)  (None, 2, 2, 2048)   0           ['conv5_block3_add[0][0]']       
                                                                                                  
==================================================================================================
Total params: 42,658,176
Trainable params: 42,552,832
Non-trainable params: 105,344
__________________________________________________________________________________________________

Model Building¶

We have imported the ResNet v2 model up to layer 'conv_block23_add', as this has shown the best performance compared to other layers (discussed below). The ResNet v2 layers will be frozen, so the only trainable layers will be those we add ourselves. After flattening the input from 'conv_block23_add', we will add the same architecture we did earlier to VGG16, namely 2 dense layers, followed by a DropOut layer, another dense layer, and BatchNormalization. We will once again end with a softmax classifier, as this is a multi-class classification exercise.

In [66]:
transfer_layer = Resnet.get_layer('conv4_block23_add')
Resnet.trainable = False

# Flatten the input
x = Flatten()(transfer_layer.output)

# Dense layers
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(64, activation='relu')(x)
x = BatchNormalization()(x)

# Classifier
pred = Dense(4, activation='softmax')(x)

# Initialize the model
model_4 = Model(Resnet.input, pred)

Compiling and Training the Model¶

In [67]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_4.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 15,    # Increased over initial models otherwise training is cut off too quickly
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [68]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_4.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [69]:
# Fitting model with epochs set to 100
history_4 = model_4.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
473/473 [==============================] - ETA: 0s - loss: 1.4286 - accuracy: 0.2619
Epoch 1: val_accuracy improved from -inf to 0.36287, saving model to ./model_4.h5
473/473 [==============================] - 41s 80ms/step - loss: 1.4286 - accuracy: 0.2619 - val_loss: 1.3548 - val_accuracy: 0.3629 - lr: 0.0010
Epoch 2/100
473/473 [==============================] - ETA: 0s - loss: 1.4022 - accuracy: 0.2626
Epoch 2: val_accuracy did not improve from 0.36287
473/473 [==============================] - 32s 66ms/step - loss: 1.4022 - accuracy: 0.2626 - val_loss: 1.4131 - val_accuracy: 0.2443 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 1.3988 - accuracy: 0.2634
Epoch 3: val_accuracy did not improve from 0.36287
473/473 [==============================] - 32s 67ms/step - loss: 1.3988 - accuracy: 0.2634 - val_loss: 1.4782 - val_accuracy: 0.2289 - lr: 0.0010
Epoch 4/100
472/473 [============================>.] - ETA: 0s - loss: 1.3911 - accuracy: 0.2682
Epoch 4: val_accuracy did not improve from 0.36287

Epoch 4: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 32s 67ms/step - loss: 1.3911 - accuracy: 0.2681 - val_loss: 1.4236 - val_accuracy: 0.2443 - lr: 0.0010
Epoch 5/100
473/473 [==============================] - ETA: 0s - loss: 1.3799 - accuracy: 0.2763
Epoch 5: val_accuracy did not improve from 0.36287
473/473 [==============================] - 33s 70ms/step - loss: 1.3799 - accuracy: 0.2763 - val_loss: 1.4006 - val_accuracy: 0.2329 - lr: 2.0000e-04
Epoch 6/100
472/473 [============================>.] - ETA: 0s - loss: 1.3757 - accuracy: 0.2825
Epoch 6: val_accuracy did not improve from 0.36287
473/473 [==============================] - 32s 68ms/step - loss: 1.3758 - accuracy: 0.2825 - val_loss: 1.4284 - val_accuracy: 0.2524 - lr: 2.0000e-04
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 1.3730 - accuracy: 0.2878
Epoch 7: val_accuracy did not improve from 0.36287

Epoch 7: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 32s 68ms/step - loss: 1.3730 - accuracy: 0.2878 - val_loss: 1.3915 - val_accuracy: 0.2443 - lr: 2.0000e-04
Epoch 8/100
473/473 [==============================] - ETA: 0s - loss: 1.3691 - accuracy: 0.2905
Epoch 8: val_accuracy did not improve from 0.36287
473/473 [==============================] - 33s 70ms/step - loss: 1.3691 - accuracy: 0.2905 - val_loss: 1.4117 - val_accuracy: 0.2560 - lr: 4.0000e-05
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 1.3658 - accuracy: 0.3011
Epoch 9: val_accuracy did not improve from 0.36287
473/473 [==============================] - 32s 68ms/step - loss: 1.3658 - accuracy: 0.3011 - val_loss: 1.4033 - val_accuracy: 0.2556 - lr: 4.0000e-05
Epoch 10/100
472/473 [============================>.] - ETA: 0s - loss: 1.3664 - accuracy: 0.2946
Epoch 10: val_accuracy did not improve from 0.36287

Epoch 10: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
473/473 [==============================] - 32s 67ms/step - loss: 1.3666 - accuracy: 0.2944 - val_loss: 1.4060 - val_accuracy: 0.2431 - lr: 4.0000e-05
Epoch 11/100
473/473 [==============================] - ETA: 0s - loss: 1.3639 - accuracy: 0.3004
Epoch 11: val_accuracy did not improve from 0.36287
473/473 [==============================] - 33s 71ms/step - loss: 1.3639 - accuracy: 0.3004 - val_loss: 1.3998 - val_accuracy: 0.2558 - lr: 8.0000e-06
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 1.3632 - accuracy: 0.2994
Epoch 12: val_accuracy did not improve from 0.36287
473/473 [==============================] - 33s 69ms/step - loss: 1.3632 - accuracy: 0.2994 - val_loss: 1.3965 - val_accuracy: 0.2574 - lr: 8.0000e-06
Epoch 13/100
473/473 [==============================] - ETA: 0s - loss: 1.3662 - accuracy: 0.3015
Epoch 13: val_accuracy did not improve from 0.36287

Epoch 13: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06.
473/473 [==============================] - 32s 67ms/step - loss: 1.3662 - accuracy: 0.3015 - val_loss: 1.3940 - val_accuracy: 0.2592 - lr: 8.0000e-06
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 1.3655 - accuracy: 0.2984
Epoch 14: val_accuracy did not improve from 0.36287
473/473 [==============================] - 32s 67ms/step - loss: 1.3655 - accuracy: 0.2984 - val_loss: 1.3998 - val_accuracy: 0.2542 - lr: 1.6000e-06
Epoch 15/100
473/473 [==============================] - ETA: 0s - loss: 1.3623 - accuracy: 0.3027
Epoch 15: val_accuracy did not improve from 0.36287
473/473 [==============================] - 33s 71ms/step - loss: 1.3623 - accuracy: 0.3027 - val_loss: 1.3961 - val_accuracy: 0.2495 - lr: 1.6000e-06
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 1.3630 - accuracy: 0.2998
Epoch 16: val_accuracy did not improve from 0.36287
Restoring model weights from the end of the best epoch: 1.

Epoch 16: ReduceLROnPlateau reducing learning rate to 3.200000264769187e-07.
473/473 [==============================] - 32s 68ms/step - loss: 1.3630 - accuracy: 0.2998 - val_loss: 1.3992 - val_accuracy: 0.2550 - lr: 1.6000e-06
Epoch 16: early stopping
In [70]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_4.history['accuracy'])
plt.plot(history_4.history['val_accuracy'])
plt.title('Accuracy - ResNet V2 Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [71]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_4.history['loss'])
plt.plot(history_4.history['val_loss'])
plt.title('Loss - ResNet V2 Model')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the ResNet Model¶

In [72]:
# Evaluating the model's performance on the test set
accuracy = model_4.evaluate(test_set_rgb)
4/4 [==============================] - 0s 46ms/step - loss: 1.4027 - accuracy: 0.2812

Observations and Insights:
As imported and modified, our transfer learning model shows terrible performance. After just 1 epoch (the 'best' epoch!), training accuracy stands at 0.26 and validation accuracy is 0.36. Accuracy and loss for both training and validation data level off fairly quickly at which point early stopping aborts the training. The above accuracy and loss curves paint the picture of poor model that will not generalize well at all. The model's test accuracy comes in at 0.34.

The ResNet v2 model was ultimately imported up to layer 'conv4_block23_add', as it produced the 'best' performance, though it was difficult to choose. A history of alternative models is below.

Train Loss Train Accuracy Val Loss Val Accuracy
ResNet V2 conv4_block23_add (selected) 1.43 0.26 1.35 0.36
ResNet V2 conv5_block3_add 1.47 0.23 1.43 0.33
ResNet V2 conv3_block4_add 1.49 0.22 1.44 0.33
ResNet V2 conv2_block3_add 1.51 0.21 1.55 0.21


Model 5: EfficientNet¶

Our third transfer learning model is EfficientNet, which is a CNN that uses 'compound scaling' to improve efficiency and, theoretically at least, performance. Like VGG16 and ResNet v2, color_mode must be set to RGB to leverage this pre-trained architecture.

In [73]:
EfficientNet = ap.EfficientNetV2B2(include_top=False, weights="imagenet", input_shape= (48, 48, 3))
EfficientNet.summary()
Metal device set to: Apple M1 Pro
Model: "efficientnetv2-b2"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 48, 48, 3)]  0           []                               
                                                                                                  
 rescaling (Rescaling)          (None, 48, 48, 3)    0           ['input_1[0][0]']                
                                                                                                  
 normalization (Normalization)  (None, 48, 48, 3)    0           ['rescaling[0][0]']              
                                                                                                  
 stem_conv (Conv2D)             (None, 24, 24, 32)   864         ['normalization[0][0]']          
                                                                                                  
 stem_bn (BatchNormalization)   (None, 24, 24, 32)   128         ['stem_conv[0][0]']              
                                                                                                  
 stem_activation (Activation)   (None, 24, 24, 32)   0           ['stem_bn[0][0]']                
                                                                                                  
 block1a_project_conv (Conv2D)  (None, 24, 24, 16)   4608        ['stem_activation[0][0]']        
                                                                                                  
 block1a_project_bn (BatchNorma  (None, 24, 24, 16)  64          ['block1a_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block1a_project_activation (Ac  (None, 24, 24, 16)  0           ['block1a_project_bn[0][0]']     
 tivation)                                                                                        
                                                                                                  
 block1b_project_conv (Conv2D)  (None, 24, 24, 16)   2304        ['block1a_project_activation[0][0
                                                                 ]']                              
                                                                                                  
 block1b_project_bn (BatchNorma  (None, 24, 24, 16)  64          ['block1b_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block1b_project_activation (Ac  (None, 24, 24, 16)  0           ['block1b_project_bn[0][0]']     
 tivation)                                                                                        
                                                                                                  
 block1b_drop (Dropout)         (None, 24, 24, 16)   0           ['block1b_project_activation[0][0
                                                                 ]']                              
                                                                                                  
 block1b_add (Add)              (None, 24, 24, 16)   0           ['block1b_drop[0][0]',           
                                                                  'block1a_project_activation[0][0
                                                                 ]']                              
                                                                                                  
 block2a_expand_conv (Conv2D)   (None, 12, 12, 64)   9216        ['block1b_add[0][0]']            
                                                                                                  
 block2a_expand_bn (BatchNormal  (None, 12, 12, 64)  256         ['block2a_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block2a_expand_activation (Act  (None, 12, 12, 64)  0           ['block2a_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block2a_project_conv (Conv2D)  (None, 12, 12, 32)   2048        ['block2a_expand_activation[0][0]
                                                                 ']                               
                                                                                                  
 block2a_project_bn (BatchNorma  (None, 12, 12, 32)  128         ['block2a_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block2b_expand_conv (Conv2D)   (None, 12, 12, 128)  36864       ['block2a_project_bn[0][0]']     
                                                                                                  
 block2b_expand_bn (BatchNormal  (None, 12, 12, 128)  512        ['block2b_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block2b_expand_activation (Act  (None, 12, 12, 128)  0          ['block2b_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block2b_project_conv (Conv2D)  (None, 12, 12, 32)   4096        ['block2b_expand_activation[0][0]
                                                                 ']                               
                                                                                                  
 block2b_project_bn (BatchNorma  (None, 12, 12, 32)  128         ['block2b_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block2b_drop (Dropout)         (None, 12, 12, 32)   0           ['block2b_project_bn[0][0]']     
                                                                                                  
 block2b_add (Add)              (None, 12, 12, 32)   0           ['block2b_drop[0][0]',           
                                                                  'block2a_project_bn[0][0]']     
                                                                                                  
 block2c_expand_conv (Conv2D)   (None, 12, 12, 128)  36864       ['block2b_add[0][0]']            
                                                                                                  
 block2c_expand_bn (BatchNormal  (None, 12, 12, 128)  512        ['block2c_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block2c_expand_activation (Act  (None, 12, 12, 128)  0          ['block2c_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block2c_project_conv (Conv2D)  (None, 12, 12, 32)   4096        ['block2c_expand_activation[0][0]
                                                                 ']                               
                                                                                                  
 block2c_project_bn (BatchNorma  (None, 12, 12, 32)  128         ['block2c_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block2c_drop (Dropout)         (None, 12, 12, 32)   0           ['block2c_project_bn[0][0]']     
                                                                                                  
 block2c_add (Add)              (None, 12, 12, 32)   0           ['block2c_drop[0][0]',           
                                                                  'block2b_add[0][0]']            
                                                                                                  
 block3a_expand_conv (Conv2D)   (None, 6, 6, 128)    36864       ['block2c_add[0][0]']            
                                                                                                  
 block3a_expand_bn (BatchNormal  (None, 6, 6, 128)   512         ['block3a_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block3a_expand_activation (Act  (None, 6, 6, 128)   0           ['block3a_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block3a_project_conv (Conv2D)  (None, 6, 6, 56)     7168        ['block3a_expand_activation[0][0]
                                                                 ']                               
                                                                                                  
 block3a_project_bn (BatchNorma  (None, 6, 6, 56)    224         ['block3a_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block3b_expand_conv (Conv2D)   (None, 6, 6, 224)    112896      ['block3a_project_bn[0][0]']     
                                                                                                  
 block3b_expand_bn (BatchNormal  (None, 6, 6, 224)   896         ['block3b_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block3b_expand_activation (Act  (None, 6, 6, 224)   0           ['block3b_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block3b_project_conv (Conv2D)  (None, 6, 6, 56)     12544       ['block3b_expand_activation[0][0]
                                                                 ']                               
                                                                                                  
 block3b_project_bn (BatchNorma  (None, 6, 6, 56)    224         ['block3b_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block3b_drop (Dropout)         (None, 6, 6, 56)     0           ['block3b_project_bn[0][0]']     
                                                                                                  
 block3b_add (Add)              (None, 6, 6, 56)     0           ['block3b_drop[0][0]',           
                                                                  'block3a_project_bn[0][0]']     
                                                                                                  
 block3c_expand_conv (Conv2D)   (None, 6, 6, 224)    112896      ['block3b_add[0][0]']            
                                                                                                  
 block3c_expand_bn (BatchNormal  (None, 6, 6, 224)   896         ['block3c_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block3c_expand_activation (Act  (None, 6, 6, 224)   0           ['block3c_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block3c_project_conv (Conv2D)  (None, 6, 6, 56)     12544       ['block3c_expand_activation[0][0]
                                                                 ']                               
                                                                                                  
 block3c_project_bn (BatchNorma  (None, 6, 6, 56)    224         ['block3c_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block3c_drop (Dropout)         (None, 6, 6, 56)     0           ['block3c_project_bn[0][0]']     
                                                                                                  
 block3c_add (Add)              (None, 6, 6, 56)     0           ['block3c_drop[0][0]',           
                                                                  'block3b_add[0][0]']            
                                                                                                  
 block4a_expand_conv (Conv2D)   (None, 6, 6, 224)    12544       ['block3c_add[0][0]']            
                                                                                                  
 block4a_expand_bn (BatchNormal  (None, 6, 6, 224)   896         ['block4a_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block4a_expand_activation (Act  (None, 6, 6, 224)   0           ['block4a_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block4a_dwconv2 (DepthwiseConv  (None, 3, 3, 224)   2016        ['block4a_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block4a_bn (BatchNormalization  (None, 3, 3, 224)   896         ['block4a_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block4a_activation (Activation  (None, 3, 3, 224)   0           ['block4a_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block4a_se_squeeze (GlobalAver  (None, 224)         0           ['block4a_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block4a_se_reshape (Reshape)   (None, 1, 1, 224)    0           ['block4a_se_squeeze[0][0]']     
                                                                                                  
 block4a_se_reduce (Conv2D)     (None, 1, 1, 14)     3150        ['block4a_se_reshape[0][0]']     
                                                                                                  
 block4a_se_expand (Conv2D)     (None, 1, 1, 224)    3360        ['block4a_se_reduce[0][0]']      
                                                                                                  
 block4a_se_excite (Multiply)   (None, 3, 3, 224)    0           ['block4a_activation[0][0]',     
                                                                  'block4a_se_expand[0][0]']      
                                                                                                  
 block4a_project_conv (Conv2D)  (None, 3, 3, 104)    23296       ['block4a_se_excite[0][0]']      
                                                                                                  
 block4a_project_bn (BatchNorma  (None, 3, 3, 104)   416         ['block4a_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block4b_expand_conv (Conv2D)   (None, 3, 3, 416)    43264       ['block4a_project_bn[0][0]']     
                                                                                                  
 block4b_expand_bn (BatchNormal  (None, 3, 3, 416)   1664        ['block4b_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block4b_expand_activation (Act  (None, 3, 3, 416)   0           ['block4b_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block4b_dwconv2 (DepthwiseConv  (None, 3, 3, 416)   3744        ['block4b_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block4b_bn (BatchNormalization  (None, 3, 3, 416)   1664        ['block4b_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block4b_activation (Activation  (None, 3, 3, 416)   0           ['block4b_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block4b_se_squeeze (GlobalAver  (None, 416)         0           ['block4b_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block4b_se_reshape (Reshape)   (None, 1, 1, 416)    0           ['block4b_se_squeeze[0][0]']     
                                                                                                  
 block4b_se_reduce (Conv2D)     (None, 1, 1, 26)     10842       ['block4b_se_reshape[0][0]']     
                                                                                                  
 block4b_se_expand (Conv2D)     (None, 1, 1, 416)    11232       ['block4b_se_reduce[0][0]']      
                                                                                                  
 block4b_se_excite (Multiply)   (None, 3, 3, 416)    0           ['block4b_activation[0][0]',     
                                                                  'block4b_se_expand[0][0]']      
                                                                                                  
 block4b_project_conv (Conv2D)  (None, 3, 3, 104)    43264       ['block4b_se_excite[0][0]']      
                                                                                                  
 block4b_project_bn (BatchNorma  (None, 3, 3, 104)   416         ['block4b_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block4b_drop (Dropout)         (None, 3, 3, 104)    0           ['block4b_project_bn[0][0]']     
                                                                                                  
 block4b_add (Add)              (None, 3, 3, 104)    0           ['block4b_drop[0][0]',           
                                                                  'block4a_project_bn[0][0]']     
                                                                                                  
 block4c_expand_conv (Conv2D)   (None, 3, 3, 416)    43264       ['block4b_add[0][0]']            
                                                                                                  
 block4c_expand_bn (BatchNormal  (None, 3, 3, 416)   1664        ['block4c_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block4c_expand_activation (Act  (None, 3, 3, 416)   0           ['block4c_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block4c_dwconv2 (DepthwiseConv  (None, 3, 3, 416)   3744        ['block4c_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block4c_bn (BatchNormalization  (None, 3, 3, 416)   1664        ['block4c_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block4c_activation (Activation  (None, 3, 3, 416)   0           ['block4c_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block4c_se_squeeze (GlobalAver  (None, 416)         0           ['block4c_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block4c_se_reshape (Reshape)   (None, 1, 1, 416)    0           ['block4c_se_squeeze[0][0]']     
                                                                                                  
 block4c_se_reduce (Conv2D)     (None, 1, 1, 26)     10842       ['block4c_se_reshape[0][0]']     
                                                                                                  
 block4c_se_expand (Conv2D)     (None, 1, 1, 416)    11232       ['block4c_se_reduce[0][0]']      
                                                                                                  
 block4c_se_excite (Multiply)   (None, 3, 3, 416)    0           ['block4c_activation[0][0]',     
                                                                  'block4c_se_expand[0][0]']      
                                                                                                  
 block4c_project_conv (Conv2D)  (None, 3, 3, 104)    43264       ['block4c_se_excite[0][0]']      
                                                                                                  
 block4c_project_bn (BatchNorma  (None, 3, 3, 104)   416         ['block4c_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block4c_drop (Dropout)         (None, 3, 3, 104)    0           ['block4c_project_bn[0][0]']     
                                                                                                  
 block4c_add (Add)              (None, 3, 3, 104)    0           ['block4c_drop[0][0]',           
                                                                  'block4b_add[0][0]']            
                                                                                                  
 block4d_expand_conv (Conv2D)   (None, 3, 3, 416)    43264       ['block4c_add[0][0]']            
                                                                                                  
 block4d_expand_bn (BatchNormal  (None, 3, 3, 416)   1664        ['block4d_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block4d_expand_activation (Act  (None, 3, 3, 416)   0           ['block4d_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block4d_dwconv2 (DepthwiseConv  (None, 3, 3, 416)   3744        ['block4d_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block4d_bn (BatchNormalization  (None, 3, 3, 416)   1664        ['block4d_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block4d_activation (Activation  (None, 3, 3, 416)   0           ['block4d_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block4d_se_squeeze (GlobalAver  (None, 416)         0           ['block4d_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block4d_se_reshape (Reshape)   (None, 1, 1, 416)    0           ['block4d_se_squeeze[0][0]']     
                                                                                                  
 block4d_se_reduce (Conv2D)     (None, 1, 1, 26)     10842       ['block4d_se_reshape[0][0]']     
                                                                                                  
 block4d_se_expand (Conv2D)     (None, 1, 1, 416)    11232       ['block4d_se_reduce[0][0]']      
                                                                                                  
 block4d_se_excite (Multiply)   (None, 3, 3, 416)    0           ['block4d_activation[0][0]',     
                                                                  'block4d_se_expand[0][0]']      
                                                                                                  
 block4d_project_conv (Conv2D)  (None, 3, 3, 104)    43264       ['block4d_se_excite[0][0]']      
                                                                                                  
 block4d_project_bn (BatchNorma  (None, 3, 3, 104)   416         ['block4d_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block4d_drop (Dropout)         (None, 3, 3, 104)    0           ['block4d_project_bn[0][0]']     
                                                                                                  
 block4d_add (Add)              (None, 3, 3, 104)    0           ['block4d_drop[0][0]',           
                                                                  'block4c_add[0][0]']            
                                                                                                  
 block5a_expand_conv (Conv2D)   (None, 3, 3, 624)    64896       ['block4d_add[0][0]']            
                                                                                                  
 block5a_expand_bn (BatchNormal  (None, 3, 3, 624)   2496        ['block5a_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block5a_expand_activation (Act  (None, 3, 3, 624)   0           ['block5a_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block5a_dwconv2 (DepthwiseConv  (None, 3, 3, 624)   5616        ['block5a_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block5a_bn (BatchNormalization  (None, 3, 3, 624)   2496        ['block5a_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block5a_activation (Activation  (None, 3, 3, 624)   0           ['block5a_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block5a_se_squeeze (GlobalAver  (None, 624)         0           ['block5a_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block5a_se_reshape (Reshape)   (None, 1, 1, 624)    0           ['block5a_se_squeeze[0][0]']     
                                                                                                  
 block5a_se_reduce (Conv2D)     (None, 1, 1, 26)     16250       ['block5a_se_reshape[0][0]']     
                                                                                                  
 block5a_se_expand (Conv2D)     (None, 1, 1, 624)    16848       ['block5a_se_reduce[0][0]']      
                                                                                                  
 block5a_se_excite (Multiply)   (None, 3, 3, 624)    0           ['block5a_activation[0][0]',     
                                                                  'block5a_se_expand[0][0]']      
                                                                                                  
 block5a_project_conv (Conv2D)  (None, 3, 3, 120)    74880       ['block5a_se_excite[0][0]']      
                                                                                                  
 block5a_project_bn (BatchNorma  (None, 3, 3, 120)   480         ['block5a_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block5b_expand_conv (Conv2D)   (None, 3, 3, 720)    86400       ['block5a_project_bn[0][0]']     
                                                                                                  
 block5b_expand_bn (BatchNormal  (None, 3, 3, 720)   2880        ['block5b_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block5b_expand_activation (Act  (None, 3, 3, 720)   0           ['block5b_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block5b_dwconv2 (DepthwiseConv  (None, 3, 3, 720)   6480        ['block5b_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block5b_bn (BatchNormalization  (None, 3, 3, 720)   2880        ['block5b_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block5b_activation (Activation  (None, 3, 3, 720)   0           ['block5b_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block5b_se_squeeze (GlobalAver  (None, 720)         0           ['block5b_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block5b_se_reshape (Reshape)   (None, 1, 1, 720)    0           ['block5b_se_squeeze[0][0]']     
                                                                                                  
 block5b_se_reduce (Conv2D)     (None, 1, 1, 30)     21630       ['block5b_se_reshape[0][0]']     
                                                                                                  
 block5b_se_expand (Conv2D)     (None, 1, 1, 720)    22320       ['block5b_se_reduce[0][0]']      
                                                                                                  
 block5b_se_excite (Multiply)   (None, 3, 3, 720)    0           ['block5b_activation[0][0]',     
                                                                  'block5b_se_expand[0][0]']      
                                                                                                  
 block5b_project_conv (Conv2D)  (None, 3, 3, 120)    86400       ['block5b_se_excite[0][0]']      
                                                                                                  
 block5b_project_bn (BatchNorma  (None, 3, 3, 120)   480         ['block5b_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block5b_drop (Dropout)         (None, 3, 3, 120)    0           ['block5b_project_bn[0][0]']     
                                                                                                  
 block5b_add (Add)              (None, 3, 3, 120)    0           ['block5b_drop[0][0]',           
                                                                  'block5a_project_bn[0][0]']     
                                                                                                  
 block5c_expand_conv (Conv2D)   (None, 3, 3, 720)    86400       ['block5b_add[0][0]']            
                                                                                                  
 block5c_expand_bn (BatchNormal  (None, 3, 3, 720)   2880        ['block5c_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block5c_expand_activation (Act  (None, 3, 3, 720)   0           ['block5c_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block5c_dwconv2 (DepthwiseConv  (None, 3, 3, 720)   6480        ['block5c_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block5c_bn (BatchNormalization  (None, 3, 3, 720)   2880        ['block5c_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block5c_activation (Activation  (None, 3, 3, 720)   0           ['block5c_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block5c_se_squeeze (GlobalAver  (None, 720)         0           ['block5c_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block5c_se_reshape (Reshape)   (None, 1, 1, 720)    0           ['block5c_se_squeeze[0][0]']     
                                                                                                  
 block5c_se_reduce (Conv2D)     (None, 1, 1, 30)     21630       ['block5c_se_reshape[0][0]']     
                                                                                                  
 block5c_se_expand (Conv2D)     (None, 1, 1, 720)    22320       ['block5c_se_reduce[0][0]']      
                                                                                                  
 block5c_se_excite (Multiply)   (None, 3, 3, 720)    0           ['block5c_activation[0][0]',     
                                                                  'block5c_se_expand[0][0]']      
                                                                                                  
 block5c_project_conv (Conv2D)  (None, 3, 3, 120)    86400       ['block5c_se_excite[0][0]']      
                                                                                                  
 block5c_project_bn (BatchNorma  (None, 3, 3, 120)   480         ['block5c_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block5c_drop (Dropout)         (None, 3, 3, 120)    0           ['block5c_project_bn[0][0]']     
                                                                                                  
 block5c_add (Add)              (None, 3, 3, 120)    0           ['block5c_drop[0][0]',           
                                                                  'block5b_add[0][0]']            
                                                                                                  
 block5d_expand_conv (Conv2D)   (None, 3, 3, 720)    86400       ['block5c_add[0][0]']            
                                                                                                  
 block5d_expand_bn (BatchNormal  (None, 3, 3, 720)   2880        ['block5d_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block5d_expand_activation (Act  (None, 3, 3, 720)   0           ['block5d_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block5d_dwconv2 (DepthwiseConv  (None, 3, 3, 720)   6480        ['block5d_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block5d_bn (BatchNormalization  (None, 3, 3, 720)   2880        ['block5d_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block5d_activation (Activation  (None, 3, 3, 720)   0           ['block5d_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block5d_se_squeeze (GlobalAver  (None, 720)         0           ['block5d_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block5d_se_reshape (Reshape)   (None, 1, 1, 720)    0           ['block5d_se_squeeze[0][0]']     
                                                                                                  
 block5d_se_reduce (Conv2D)     (None, 1, 1, 30)     21630       ['block5d_se_reshape[0][0]']     
                                                                                                  
 block5d_se_expand (Conv2D)     (None, 1, 1, 720)    22320       ['block5d_se_reduce[0][0]']      
                                                                                                  
 block5d_se_excite (Multiply)   (None, 3, 3, 720)    0           ['block5d_activation[0][0]',     
                                                                  'block5d_se_expand[0][0]']      
                                                                                                  
 block5d_project_conv (Conv2D)  (None, 3, 3, 120)    86400       ['block5d_se_excite[0][0]']      
                                                                                                  
 block5d_project_bn (BatchNorma  (None, 3, 3, 120)   480         ['block5d_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block5d_drop (Dropout)         (None, 3, 3, 120)    0           ['block5d_project_bn[0][0]']     
                                                                                                  
 block5d_add (Add)              (None, 3, 3, 120)    0           ['block5d_drop[0][0]',           
                                                                  'block5c_add[0][0]']            
                                                                                                  
 block5e_expand_conv (Conv2D)   (None, 3, 3, 720)    86400       ['block5d_add[0][0]']            
                                                                                                  
 block5e_expand_bn (BatchNormal  (None, 3, 3, 720)   2880        ['block5e_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block5e_expand_activation (Act  (None, 3, 3, 720)   0           ['block5e_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block5e_dwconv2 (DepthwiseConv  (None, 3, 3, 720)   6480        ['block5e_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block5e_bn (BatchNormalization  (None, 3, 3, 720)   2880        ['block5e_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block5e_activation (Activation  (None, 3, 3, 720)   0           ['block5e_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block5e_se_squeeze (GlobalAver  (None, 720)         0           ['block5e_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block5e_se_reshape (Reshape)   (None, 1, 1, 720)    0           ['block5e_se_squeeze[0][0]']     
                                                                                                  
 block5e_se_reduce (Conv2D)     (None, 1, 1, 30)     21630       ['block5e_se_reshape[0][0]']     
                                                                                                  
 block5e_se_expand (Conv2D)     (None, 1, 1, 720)    22320       ['block5e_se_reduce[0][0]']      
                                                                                                  
 block5e_se_excite (Multiply)   (None, 3, 3, 720)    0           ['block5e_activation[0][0]',     
                                                                  'block5e_se_expand[0][0]']      
                                                                                                  
 block5e_project_conv (Conv2D)  (None, 3, 3, 120)    86400       ['block5e_se_excite[0][0]']      
                                                                                                  
 block5e_project_bn (BatchNorma  (None, 3, 3, 120)   480         ['block5e_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block5e_drop (Dropout)         (None, 3, 3, 120)    0           ['block5e_project_bn[0][0]']     
                                                                                                  
 block5e_add (Add)              (None, 3, 3, 120)    0           ['block5e_drop[0][0]',           
                                                                  'block5d_add[0][0]']            
                                                                                                  
 block5f_expand_conv (Conv2D)   (None, 3, 3, 720)    86400       ['block5e_add[0][0]']            
                                                                                                  
 block5f_expand_bn (BatchNormal  (None, 3, 3, 720)   2880        ['block5f_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block5f_expand_activation (Act  (None, 3, 3, 720)   0           ['block5f_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block5f_dwconv2 (DepthwiseConv  (None, 3, 3, 720)   6480        ['block5f_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block5f_bn (BatchNormalization  (None, 3, 3, 720)   2880        ['block5f_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block5f_activation (Activation  (None, 3, 3, 720)   0           ['block5f_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block5f_se_squeeze (GlobalAver  (None, 720)         0           ['block5f_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block5f_se_reshape (Reshape)   (None, 1, 1, 720)    0           ['block5f_se_squeeze[0][0]']     
                                                                                                  
 block5f_se_reduce (Conv2D)     (None, 1, 1, 30)     21630       ['block5f_se_reshape[0][0]']     
                                                                                                  
 block5f_se_expand (Conv2D)     (None, 1, 1, 720)    22320       ['block5f_se_reduce[0][0]']      
                                                                                                  
 block5f_se_excite (Multiply)   (None, 3, 3, 720)    0           ['block5f_activation[0][0]',     
                                                                  'block5f_se_expand[0][0]']      
                                                                                                  
 block5f_project_conv (Conv2D)  (None, 3, 3, 120)    86400       ['block5f_se_excite[0][0]']      
                                                                                                  
 block5f_project_bn (BatchNorma  (None, 3, 3, 120)   480         ['block5f_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block5f_drop (Dropout)         (None, 3, 3, 120)    0           ['block5f_project_bn[0][0]']     
                                                                                                  
 block5f_add (Add)              (None, 3, 3, 120)    0           ['block5f_drop[0][0]',           
                                                                  'block5e_add[0][0]']            
                                                                                                  
 block6a_expand_conv (Conv2D)   (None, 3, 3, 720)    86400       ['block5f_add[0][0]']            
                                                                                                  
 block6a_expand_bn (BatchNormal  (None, 3, 3, 720)   2880        ['block6a_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6a_expand_activation (Act  (None, 3, 3, 720)   0           ['block6a_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6a_dwconv2 (DepthwiseConv  (None, 2, 2, 720)   6480        ['block6a_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6a_bn (BatchNormalization  (None, 2, 2, 720)   2880        ['block6a_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6a_activation (Activation  (None, 2, 2, 720)   0           ['block6a_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6a_se_squeeze (GlobalAver  (None, 720)         0           ['block6a_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6a_se_reshape (Reshape)   (None, 1, 1, 720)    0           ['block6a_se_squeeze[0][0]']     
                                                                                                  
 block6a_se_reduce (Conv2D)     (None, 1, 1, 30)     21630       ['block6a_se_reshape[0][0]']     
                                                                                                  
 block6a_se_expand (Conv2D)     (None, 1, 1, 720)    22320       ['block6a_se_reduce[0][0]']      
                                                                                                  
 block6a_se_excite (Multiply)   (None, 2, 2, 720)    0           ['block6a_activation[0][0]',     
                                                                  'block6a_se_expand[0][0]']      
                                                                                                  
 block6a_project_conv (Conv2D)  (None, 2, 2, 208)    149760      ['block6a_se_excite[0][0]']      
                                                                                                  
 block6a_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6a_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6b_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6a_project_bn[0][0]']     
                                                                                                  
 block6b_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6b_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6b_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6b_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6b_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6b_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6b_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6b_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6b_activation (Activation  (None, 2, 2, 1248)  0           ['block6b_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6b_se_squeeze (GlobalAver  (None, 1248)        0           ['block6b_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6b_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6b_se_squeeze[0][0]']     
                                                                                                  
 block6b_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6b_se_reshape[0][0]']     
                                                                                                  
 block6b_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6b_se_reduce[0][0]']      
                                                                                                  
 block6b_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6b_activation[0][0]',     
                                                                  'block6b_se_expand[0][0]']      
                                                                                                  
 block6b_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6b_se_excite[0][0]']      
                                                                                                  
 block6b_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6b_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6b_drop (Dropout)         (None, 2, 2, 208)    0           ['block6b_project_bn[0][0]']     
                                                                                                  
 block6b_add (Add)              (None, 2, 2, 208)    0           ['block6b_drop[0][0]',           
                                                                  'block6a_project_bn[0][0]']     
                                                                                                  
 block6c_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6b_add[0][0]']            
                                                                                                  
 block6c_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6c_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6c_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6c_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6c_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6c_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6c_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6c_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6c_activation (Activation  (None, 2, 2, 1248)  0           ['block6c_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6c_se_squeeze (GlobalAver  (None, 1248)        0           ['block6c_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6c_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6c_se_squeeze[0][0]']     
                                                                                                  
 block6c_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6c_se_reshape[0][0]']     
                                                                                                  
 block6c_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6c_se_reduce[0][0]']      
                                                                                                  
 block6c_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6c_activation[0][0]',     
                                                                  'block6c_se_expand[0][0]']      
                                                                                                  
 block6c_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6c_se_excite[0][0]']      
                                                                                                  
 block6c_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6c_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6c_drop (Dropout)         (None, 2, 2, 208)    0           ['block6c_project_bn[0][0]']     
                                                                                                  
 block6c_add (Add)              (None, 2, 2, 208)    0           ['block6c_drop[0][0]',           
                                                                  'block6b_add[0][0]']            
                                                                                                  
 block6d_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6c_add[0][0]']            
                                                                                                  
 block6d_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6d_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6d_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6d_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6d_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6d_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6d_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6d_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6d_activation (Activation  (None, 2, 2, 1248)  0           ['block6d_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6d_se_squeeze (GlobalAver  (None, 1248)        0           ['block6d_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6d_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6d_se_squeeze[0][0]']     
                                                                                                  
 block6d_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6d_se_reshape[0][0]']     
                                                                                                  
 block6d_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6d_se_reduce[0][0]']      
                                                                                                  
 block6d_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6d_activation[0][0]',     
                                                                  'block6d_se_expand[0][0]']      
                                                                                                  
 block6d_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6d_se_excite[0][0]']      
                                                                                                  
 block6d_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6d_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6d_drop (Dropout)         (None, 2, 2, 208)    0           ['block6d_project_bn[0][0]']     
                                                                                                  
 block6d_add (Add)              (None, 2, 2, 208)    0           ['block6d_drop[0][0]',           
                                                                  'block6c_add[0][0]']            
                                                                                                  
 block6e_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6d_add[0][0]']            
                                                                                                  
 block6e_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6e_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6e_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6e_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6e_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6e_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6e_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6e_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6e_activation (Activation  (None, 2, 2, 1248)  0           ['block6e_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6e_se_squeeze (GlobalAver  (None, 1248)        0           ['block6e_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6e_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6e_se_squeeze[0][0]']     
                                                                                                  
 block6e_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6e_se_reshape[0][0]']     
                                                                                                  
 block6e_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6e_se_reduce[0][0]']      
                                                                                                  
 block6e_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6e_activation[0][0]',     
                                                                  'block6e_se_expand[0][0]']      
                                                                                                  
 block6e_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6e_se_excite[0][0]']      
                                                                                                  
 block6e_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6e_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6e_drop (Dropout)         (None, 2, 2, 208)    0           ['block6e_project_bn[0][0]']     
                                                                                                  
 block6e_add (Add)              (None, 2, 2, 208)    0           ['block6e_drop[0][0]',           
                                                                  'block6d_add[0][0]']            
                                                                                                  
 block6f_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6e_add[0][0]']            
                                                                                                  
 block6f_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6f_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6f_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6f_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6f_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6f_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6f_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6f_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6f_activation (Activation  (None, 2, 2, 1248)  0           ['block6f_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6f_se_squeeze (GlobalAver  (None, 1248)        0           ['block6f_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6f_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6f_se_squeeze[0][0]']     
                                                                                                  
 block6f_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6f_se_reshape[0][0]']     
                                                                                                  
 block6f_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6f_se_reduce[0][0]']      
                                                                                                  
 block6f_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6f_activation[0][0]',     
                                                                  'block6f_se_expand[0][0]']      
                                                                                                  
 block6f_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6f_se_excite[0][0]']      
                                                                                                  
 block6f_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6f_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6f_drop (Dropout)         (None, 2, 2, 208)    0           ['block6f_project_bn[0][0]']     
                                                                                                  
 block6f_add (Add)              (None, 2, 2, 208)    0           ['block6f_drop[0][0]',           
                                                                  'block6e_add[0][0]']            
                                                                                                  
 block6g_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6f_add[0][0]']            
                                                                                                  
 block6g_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6g_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6g_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6g_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6g_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6g_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6g_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6g_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6g_activation (Activation  (None, 2, 2, 1248)  0           ['block6g_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6g_se_squeeze (GlobalAver  (None, 1248)        0           ['block6g_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6g_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6g_se_squeeze[0][0]']     
                                                                                                  
 block6g_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6g_se_reshape[0][0]']     
                                                                                                  
 block6g_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6g_se_reduce[0][0]']      
                                                                                                  
 block6g_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6g_activation[0][0]',     
                                                                  'block6g_se_expand[0][0]']      
                                                                                                  
 block6g_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6g_se_excite[0][0]']      
                                                                                                  
 block6g_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6g_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6g_drop (Dropout)         (None, 2, 2, 208)    0           ['block6g_project_bn[0][0]']     
                                                                                                  
 block6g_add (Add)              (None, 2, 2, 208)    0           ['block6g_drop[0][0]',           
                                                                  'block6f_add[0][0]']            
                                                                                                  
 block6h_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6g_add[0][0]']            
                                                                                                  
 block6h_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6h_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6h_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6h_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6h_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6h_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6h_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6h_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6h_activation (Activation  (None, 2, 2, 1248)  0           ['block6h_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6h_se_squeeze (GlobalAver  (None, 1248)        0           ['block6h_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6h_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6h_se_squeeze[0][0]']     
                                                                                                  
 block6h_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6h_se_reshape[0][0]']     
                                                                                                  
 block6h_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6h_se_reduce[0][0]']      
                                                                                                  
 block6h_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6h_activation[0][0]',     
                                                                  'block6h_se_expand[0][0]']      
                                                                                                  
 block6h_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6h_se_excite[0][0]']      
                                                                                                  
 block6h_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6h_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6h_drop (Dropout)         (None, 2, 2, 208)    0           ['block6h_project_bn[0][0]']     
                                                                                                  
 block6h_add (Add)              (None, 2, 2, 208)    0           ['block6h_drop[0][0]',           
                                                                  'block6g_add[0][0]']            
                                                                                                  
 block6i_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6h_add[0][0]']            
                                                                                                  
 block6i_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6i_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6i_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6i_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6i_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6i_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6i_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6i_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6i_activation (Activation  (None, 2, 2, 1248)  0           ['block6i_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6i_se_squeeze (GlobalAver  (None, 1248)        0           ['block6i_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6i_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6i_se_squeeze[0][0]']     
                                                                                                  
 block6i_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6i_se_reshape[0][0]']     
                                                                                                  
 block6i_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6i_se_reduce[0][0]']      
                                                                                                  
 block6i_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6i_activation[0][0]',     
                                                                  'block6i_se_expand[0][0]']      
                                                                                                  
 block6i_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6i_se_excite[0][0]']      
                                                                                                  
 block6i_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6i_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6i_drop (Dropout)         (None, 2, 2, 208)    0           ['block6i_project_bn[0][0]']     
                                                                                                  
 block6i_add (Add)              (None, 2, 2, 208)    0           ['block6i_drop[0][0]',           
                                                                  'block6h_add[0][0]']            
                                                                                                  
 block6j_expand_conv (Conv2D)   (None, 2, 2, 1248)   259584      ['block6i_add[0][0]']            
                                                                                                  
 block6j_expand_bn (BatchNormal  (None, 2, 2, 1248)  4992        ['block6j_expand_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 block6j_expand_activation (Act  (None, 2, 2, 1248)  0           ['block6j_expand_bn[0][0]']      
 ivation)                                                                                         
                                                                                                  
 block6j_dwconv2 (DepthwiseConv  (None, 2, 2, 1248)  11232       ['block6j_expand_activation[0][0]
 2D)                                                             ']                               
                                                                                                  
 block6j_bn (BatchNormalization  (None, 2, 2, 1248)  4992        ['block6j_dwconv2[0][0]']        
 )                                                                                                
                                                                                                  
 block6j_activation (Activation  (None, 2, 2, 1248)  0           ['block6j_bn[0][0]']             
 )                                                                                                
                                                                                                  
 block6j_se_squeeze (GlobalAver  (None, 1248)        0           ['block6j_activation[0][0]']     
 agePooling2D)                                                                                    
                                                                                                  
 block6j_se_reshape (Reshape)   (None, 1, 1, 1248)   0           ['block6j_se_squeeze[0][0]']     
                                                                                                  
 block6j_se_reduce (Conv2D)     (None, 1, 1, 52)     64948       ['block6j_se_reshape[0][0]']     
                                                                                                  
 block6j_se_expand (Conv2D)     (None, 1, 1, 1248)   66144       ['block6j_se_reduce[0][0]']      
                                                                                                  
 block6j_se_excite (Multiply)   (None, 2, 2, 1248)   0           ['block6j_activation[0][0]',     
                                                                  'block6j_se_expand[0][0]']      
                                                                                                  
 block6j_project_conv (Conv2D)  (None, 2, 2, 208)    259584      ['block6j_se_excite[0][0]']      
                                                                                                  
 block6j_project_bn (BatchNorma  (None, 2, 2, 208)   832         ['block6j_project_conv[0][0]']   
 lization)                                                                                        
                                                                                                  
 block6j_drop (Dropout)         (None, 2, 2, 208)    0           ['block6j_project_bn[0][0]']     
                                                                                                  
 block6j_add (Add)              (None, 2, 2, 208)    0           ['block6j_drop[0][0]',           
                                                                  'block6i_add[0][0]']            
                                                                                                  
 top_conv (Conv2D)              (None, 2, 2, 1408)   292864      ['block6j_add[0][0]']            
                                                                                                  
 top_bn (BatchNormalization)    (None, 2, 2, 1408)   5632        ['top_conv[0][0]']               
                                                                                                  
 top_activation (Activation)    (None, 2, 2, 1408)   0           ['top_bn[0][0]']                 
                                                                                                  
==================================================================================================
Total params: 8,769,374
Trainable params: 8,687,086
Non-trainable params: 82,288
__________________________________________________________________________________________________

Model Building¶

We have imported the EfficientNet Model up to layer 'block5f_expand_activation', as this has shown the best performance compared to other layers (discussed below). The EfficientNet layers will be frozen, so the only trainable layers will be those that we add ourselves. After flattening the input from 'block5f_expand_activation', we will add the same architecture we did earlier to the VGG16 and ResNet v2 models, namely 2 dense layers, followed by a Dropout layer, another dense layer, and BatchNormalization. We will end with a softmax classifier.

In [74]:
transfer_layer_EfficientNet = EfficientNet.get_layer('block5f_expand_activation')
EfficientNet.trainable = False

# Flatten the input
x = Flatten()(transfer_layer_EfficientNet.output)

# Dense layers
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(64, activation='relu')(x)
x = BatchNormalization()(x)

# Classifier
pred = Dense(4, activation='softmax')(x)

# Initialize the model
model_5 = Model(EfficientNet.input, pred)

Compiling and Training the Model¶

In [75]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_5.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 12,
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [76]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_5.compile(optimizer = Adam(learning_rate = 0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [77]:
# Fitting model with epochs set to 100
history_5 = model_5.fit(train_set_rgb, validation_data = val_set_rgb, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
472/473 [============================>.] - ETA: 0s - loss: 1.4247 - accuracy: 0.2577
Epoch 1: val_accuracy improved from -inf to 0.22885, saving model to ./model_5.h5
473/473 [==============================] - 34s 65ms/step - loss: 1.4246 - accuracy: 0.2580 - val_loss: 1.3986 - val_accuracy: 0.2289 - lr: 0.0010
Epoch 2/100
473/473 [==============================] - ETA: 0s - loss: 1.3988 - accuracy: 0.2626
Epoch 2: val_accuracy did not improve from 0.22885
473/473 [==============================] - 28s 59ms/step - loss: 1.3988 - accuracy: 0.2626 - val_loss: 1.4146 - val_accuracy: 0.2289 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 1.3956 - accuracy: 0.2587
Epoch 3: val_accuracy improved from 0.22885 to 0.24432, saving model to ./model_5.h5
473/473 [==============================] - 30s 62ms/step - loss: 1.3956 - accuracy: 0.2587 - val_loss: 1.4337 - val_accuracy: 0.2443 - lr: 0.0010
Epoch 4/100
472/473 [============================>.] - ETA: 0s - loss: 1.3905 - accuracy: 0.2629
Epoch 4: val_accuracy did not improve from 0.24432
473/473 [==============================] - 28s 59ms/step - loss: 1.3906 - accuracy: 0.2629 - val_loss: 1.3707 - val_accuracy: 0.2443 - lr: 0.0010
Epoch 5/100
473/473 [==============================] - ETA: 0s - loss: 1.3949 - accuracy: 0.2601
Epoch 5: val_accuracy did not improve from 0.24432
473/473 [==============================] - 29s 61ms/step - loss: 1.3949 - accuracy: 0.2601 - val_loss: 1.3815 - val_accuracy: 0.2289 - lr: 0.0010
Epoch 6/100
472/473 [============================>.] - ETA: 0s - loss: 1.3943 - accuracy: 0.2532
Epoch 6: val_accuracy did not improve from 0.24432
473/473 [==============================] - 27s 57ms/step - loss: 1.3943 - accuracy: 0.2531 - val_loss: 1.4059 - val_accuracy: 0.2289 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 1.3907 - accuracy: 0.2586
Epoch 7: val_accuracy did not improve from 0.24432

Epoch 7: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 28s 59ms/step - loss: 1.3907 - accuracy: 0.2586 - val_loss: 1.3807 - val_accuracy: 0.2289 - lr: 0.0010
Epoch 8/100
472/473 [============================>.] - ETA: 0s - loss: 1.3847 - accuracy: 0.2581
Epoch 8: val_accuracy did not improve from 0.24432
473/473 [==============================] - 29s 61ms/step - loss: 1.3847 - accuracy: 0.2577 - val_loss: 1.3774 - val_accuracy: 0.2289 - lr: 2.0000e-04
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 1.3828 - accuracy: 0.2626
Epoch 9: val_accuracy did not improve from 0.24432
473/473 [==============================] - 28s 59ms/step - loss: 1.3828 - accuracy: 0.2626 - val_loss: 1.3743 - val_accuracy: 0.2289 - lr: 2.0000e-04
Epoch 10/100
473/473 [==============================] - ETA: 0s - loss: 1.3832 - accuracy: 0.2664
Epoch 10: val_accuracy did not improve from 0.24432

Epoch 10: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 29s 60ms/step - loss: 1.3832 - accuracy: 0.2664 - val_loss: 1.3724 - val_accuracy: 0.2289 - lr: 2.0000e-04
Epoch 11/100
472/473 [============================>.] - ETA: 0s - loss: 1.3828 - accuracy: 0.2629
Epoch 11: val_accuracy did not improve from 0.24432
473/473 [==============================] - 28s 60ms/step - loss: 1.3828 - accuracy: 0.2630 - val_loss: 1.3758 - val_accuracy: 0.2289 - lr: 4.0000e-05
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 1.3824 - accuracy: 0.2657
Epoch 12: val_accuracy did not improve from 0.24432
473/473 [==============================] - 29s 61ms/step - loss: 1.3824 - accuracy: 0.2657 - val_loss: 1.3776 - val_accuracy: 0.2289 - lr: 4.0000e-05
Epoch 13/100
473/473 [==============================] - ETA: 0s - loss: 1.3820 - accuracy: 0.2657
Epoch 13: val_accuracy did not improve from 0.24432

Epoch 13: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
473/473 [==============================] - 30s 63ms/step - loss: 1.3820 - accuracy: 0.2657 - val_loss: 1.3756 - val_accuracy: 0.2289 - lr: 4.0000e-05
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 1.3813 - accuracy: 0.2699
Epoch 14: val_accuracy did not improve from 0.24432
473/473 [==============================] - 27s 56ms/step - loss: 1.3813 - accuracy: 0.2699 - val_loss: 1.3776 - val_accuracy: 0.2289 - lr: 8.0000e-06
Epoch 15/100
473/473 [==============================] - ETA: 0s - loss: 1.3815 - accuracy: 0.2655
Epoch 15: val_accuracy did not improve from 0.24432
473/473 [==============================] - 28s 58ms/step - loss: 1.3815 - accuracy: 0.2655 - val_loss: 1.3778 - val_accuracy: 0.2289 - lr: 8.0000e-06
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 1.3817 - accuracy: 0.2652
Epoch 16: val_accuracy did not improve from 0.24432
Restoring model weights from the end of the best epoch: 4.

Epoch 16: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06.
473/473 [==============================] - 29s 61ms/step - loss: 1.3817 - accuracy: 0.2652 - val_loss: 1.3777 - val_accuracy: 0.2289 - lr: 8.0000e-06
Epoch 16: early stopping
In [78]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_5.history['accuracy'])
plt.plot(history_5.history['val_accuracy'])
plt.title('Accuracy - EfficientNet Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='center right')
plt.show()
In [79]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_5.history['loss'])
plt.plot(history_5.history['val_loss'])
plt.title('Loss - EfficientNet Model')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the EfficientNet Model¶

In [80]:
# Evaluating the model's performance on the test set
accuracy = model_5.evaluate(test_set_rgb)
4/4 [==============================] - 0s 58ms/step - loss: 1.3913 - accuracy: 0.2500

Observations and Insights:
As imported and modified, this model performs poorly. After just 4 epochs (the 'best' epoch), training accuracy stands at 0.26 and validation accuracy is 0.24. Training and validation accuracy are almost immediately horizontal. Loss declines a bit before leveling off. With test accuracy coming in at 0.25, it makes the model no better than random guessing. We could build a model that classifies every single image as 'happy', and with our evenly distributed test set, it would produce the same 0.25 accuracy as our EfficientNet model.

Again, it was difficult to select a 'best' layer from which to import the EfficientNet model. A history of alternative models is below.

Train Loss Train Accuracy Val Loss Val Accuracy
EfficientNet block5f_expand_activation (selected) 1.39 0.26 1.37 0.24
EfficientNet block6e_expand_activation 1.53 0.25 1.45 0.22
EfficientNet block4a_expand_activation 1.42 0.25 1.42 0.21
EfficientNet block3c_expand_activation 1.47 0.26 1.44 0.22


Overall Observations and Insights on Transfer Learning Models:

  • As outlined above, the performance of these transfer learning models varied greatly. While the VGG16 model performed admirably (see table below), the ResNet v2 and EfficientNet models left much to be desired in terms of stability and performance.
  • On the whole, none of the transfer learning models performed better than our baseline and 2nd generation models, which was surprising.
  • Model complexity seems to have played a role in performance, as the VGG16 model has a much less complex architecture than both the ResNet v2 and the EfficientNet models. Perhaps overly-complex models trained on millions of large, color images do not perform as well on smaller, black and white images from just 4 classes.
    • VGG16, with 14.7 million parameters, is a fairly straightforward architecture, with just 19 layers from the input layer to the max import layer, 'block5_pool'.
    • ResNet v2, with 42.7 million parameters, is a much more complex architecture, with a whopping 345 layers from the input layer to the max import layer, 'conv5_block3_out'.
    • EfficientNet, with 'just' 8.8 million parameters, contains 349 layers from the input layer to the max import layer, 'top_activation'.
  • As evidenced by the table below, it would appear that the unsatisfactory performance of the transfer learning models may have more to do with their complexity than the fact that they require a colormode of RGB. The baseline and 2nd generation RGB models both performed just as well as the VGG16 model. It would seem that the downfall of ResNet v2 and EfficientNet was their complex architecture. Quite simply, the simpler models performed better. In fact, the highest performing model so far, the 2nd generation grayscale model (Model 2.1), has the smallest number of parameters.
  • Perhaps a sweet spot exists somewhere between the simplicity of our 2nd generation grayscale model and the much more complex transfer learning models we have explored thus far. If it is possible to increase the complexity of our 2nd generation grayscale model while keeping the overall complexity from ballooning too far in the direction of the transfer learning models, we may find ourselves a successful model.
Parameters Train Loss Train Accuracy Val Loss Val Accuracy Test Accuracy
Model 1.1: Baseline Grayscale 605,060 0.68 0.72 0.78 0.68 0.65
Model 1.2: Baseline RGB 605,572 0.68 0.72 0.78 0.68 0.63
Model 2.1: 2nd Gen Grayscale 455,780 0.54 0.78 0.74 0.71 0.69
Model 2.2: 2nd Gen RGB 457,828 0.59 0.76 0.72 0.71 0.68
Model 3: VGG16 14,714,688 0.71 0.72 0.80 0.67 0.66
Model 4: ResNet V2 42,658,176 1.43 0.26 1.35 0.36 0.28
Model 5: EfficientNet 8,769,374 1.39 0.26 1.37 0.24 0.25


Milestone 1¶

Model 6: Complex Neural Network Architecture¶

As previewed above, it is time to expand our 2nd generation grayscale model to see if we can improve performance. Grayscale slightly outperformed RGB in our first two models, so we will leave RGB behind and proceed with color_mode set to grayscale.

Creating our Data Loaders¶

As we are proceeding with a colormode set to grayscale, we will create new data loaders for our more complex CNN, Model 6. As our data augmentation takes place when we instantiate an ImageDataGenerator object, it is convenient to create data loaders specific to our new model so we can easily finetune our hyperparameters as needed. The ImageDataGenerators below include the parameters of the final Milestone 1 model, the highest performing CNN thus far. They were chosen after exhaustive finetuning of the model, as discussed later.

  • Batch size is set to 32. The model was tested with batch sizes of 16, 32, 45, 64, and 128. A batch size of 32 performed the best. The smaller the batch size, the longer training took. The larger the batch size, the faster the training process, though the accuracy and loss bounced around significantly, offsetting the increased speed.
  • horizontal_flip is set to 'True'. As some faces in the images face left while others face right or straight ahead, flipping the training images improves our model's ability to learn that horizontal orientation should not affect the eventual classification.
  • rescale is equal to 1./255, which normalizes the pixel values to a number between 0 and 1. This helps to prevent vanishing and exploding gradients in our network by keeping the numbers small and manageable.
  • brightness_range is set to '0.7,1.3'. A setting of 1 results in images remaining unchanged. As the number approaches zero, the images become darker. As the number approaches 2, the images become lighter. As many of the images are already very dark or very light, limiting this setting to a relatively small range around 1 will help our model learn to deal with varying pixel values without rendering some images completely unusable.
  • rotation_range is set to 25, meaning the images may randomly be rotated up to 25 degrees. Similar to flipping the images horizontally, this rotation will help the model learn that the angle of a face is not an important feature.
  • Additional data augmentation methods were attempted and later removed after failing to significantly improve model performance. Among those tested were width_shift_range, height_shift_range, shear_range, and zoom_range.
In [81]:
batch_size  = 32

# Creating ImageDataGenerator objects for grayscale colormode 
datagen_train_grayscale = ImageDataGenerator(horizontal_flip = True,
                                             rescale = 1./255, 
                                             brightness_range = (0.7,1.3), 
                                             rotation_range=25)

datagen_validation_grayscale = ImageDataGenerator(horizontal_flip = True,
                                                  rescale = 1./255, 
                                                  brightness_range = (0.7,1.3), 
                                                  rotation_range=25)

datagen_test_grayscale = ImageDataGenerator(horizontal_flip = True,
                                            rescale = 1./255, 
                                            brightness_range = (0.7,1.3), 
                                            rotation_range=25)


# Creating train, validation, and test sets for grayscale colormode

print("Grayscale Images")

train_set_grayscale = datagen_train_grayscale.flow_from_directory(dir_train,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = True)

val_set_grayscale = datagen_validation_grayscale.flow_from_directory(dir_validation,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = True)

test_set_grayscale = datagen_test_grayscale.flow_from_directory(dir_test,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = True)
Grayscale Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.

Model Building¶

The structure of the Milestone 1 model (Model 6) is below. Many configurations were tested, and the following architecture led to the best performance.

  • The model begins with an input layer accepting an input shape of '48,48,1', given that our color_mode has been set to grayscale.
  • There are 5 convolutional blocks with relu activation. Each block contains BatchNormalization, LeakyReLU, and MaxPooling layers. The first, second, and fourth blocks include a layer of GaussianNoise, while the third and fifth layers each include a Dropout layer.
  • The output of the fifth convolutional block is then flattened, and fed into 2 dense layers which include additional BatchNormalization and Dropout layers.
  • The architecture is completed with a softmax classifier, as this model is designed for multi-class classification. Test images will be classified as either happy, sad, neutral, or surprise.
  • The model contains 2.1 million parameters, making it more complex than our 2nd generation grayscale model, but not as complex as the transfer learning models, whose complexity appeared to hurt their performance.
In [82]:
# Creating a Sequential model
model_6 = Sequential()
 
# Convolutional Block #1
model_6.add(Conv2D(64, (3, 3), input_shape = (48, 48, 1), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(GaussianNoise(0.1))

# Convolutional Block #2
model_6.add(Conv2D(128, (3, 3), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(GaussianNoise(0.1))

# Convolutional Block #3
model_6.add(Conv2D(512, (2, 2), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(Dropout(0.1))

# Convolutional Block #4
model_6.add(Conv2D(512, (2, 2), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(GaussianNoise(0.1))

# Convolutional Block #5
model_6.add(Conv2D(256, (2, 2), activation='relu', padding = 'same'))
model_6.add(BatchNormalization())
model_6.add(LeakyReLU(alpha = 0.1))
model_6.add(MaxPooling2D(2, 2))
model_6.add(Dropout(0.1))

# Flatten layer
model_6.add(Flatten())

# Dense layers
model_6.add(Dense(256, activation = 'relu'))
model_6.add(BatchNormalization())
model_6.add(Dropout(0.1))

model_6.add(Dense(512, activation = 'relu'))
model_6.add(BatchNormalization())
model_6.add(Dropout(0.05))

# Classifier
model_6.add(Dense(4, activation = 'softmax'))

model_6.summary()
Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 48, 48, 64)        640       
                                                                 
 batch_normalization (BatchN  (None, 48, 48, 64)       256       
 ormalization)                                                   
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 48, 48, 64)        0         
                                                                 
 max_pooling2d (MaxPooling2D  (None, 24, 24, 64)       0         
 )                                                               
                                                                 
 gaussian_noise (GaussianNoi  (None, 24, 24, 64)       0         
 se)                                                             
                                                                 
 conv2d_1 (Conv2D)           (None, 24, 24, 128)       73856     
                                                                 
 batch_normalization_1 (Batc  (None, 24, 24, 128)      512       
 hNormalization)                                                 
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 24, 24, 128)       0         
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 12, 12, 128)      0         
 2D)                                                             
                                                                 
 gaussian_noise_1 (GaussianN  (None, 12, 12, 128)      0         
 oise)                                                           
                                                                 
 conv2d_2 (Conv2D)           (None, 12, 12, 512)       262656    
                                                                 
 batch_normalization_2 (Batc  (None, 12, 12, 512)      2048      
 hNormalization)                                                 
                                                                 
 leaky_re_lu_2 (LeakyReLU)   (None, 12, 12, 512)       0         
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 6, 6, 512)        0         
 2D)                                                             
                                                                 
 dropout (Dropout)           (None, 6, 6, 512)         0         
                                                                 
 conv2d_3 (Conv2D)           (None, 6, 6, 512)         1049088   
                                                                 
 batch_normalization_3 (Batc  (None, 6, 6, 512)        2048      
 hNormalization)                                                 
                                                                 
 leaky_re_lu_3 (LeakyReLU)   (None, 6, 6, 512)         0         
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 3, 3, 512)        0         
 2D)                                                             
                                                                 
 gaussian_noise_2 (GaussianN  (None, 3, 3, 512)        0         
 oise)                                                           
                                                                 
 conv2d_4 (Conv2D)           (None, 3, 3, 256)         524544    
                                                                 
 batch_normalization_4 (Batc  (None, 3, 3, 256)        1024      
 hNormalization)                                                 
                                                                 
 leaky_re_lu_4 (LeakyReLU)   (None, 3, 3, 256)         0         
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 1, 1, 256)        0         
 2D)                                                             
                                                                 
 dropout_1 (Dropout)         (None, 1, 1, 256)         0         
                                                                 
 flatten (Flatten)           (None, 256)               0         
                                                                 
 dense (Dense)               (None, 256)               65792     
                                                                 
 batch_normalization_5 (Batc  (None, 256)              1024      
 hNormalization)                                                 
                                                                 
 dropout_2 (Dropout)         (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 512)               131584    
                                                                 
 batch_normalization_6 (Batc  (None, 512)              2048      
 hNormalization)                                                 
                                                                 
 dropout_3 (Dropout)         (None, 512)               0         
                                                                 
 dense_2 (Dense)             (None, 4)                 2052      
                                                                 
=================================================================
Total params: 2,119,172
Trainable params: 2,114,692
Non-trainable params: 4,480
_________________________________________________________________

Compiling and Training the Model¶

In [83]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_6.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                              min_delta = 0,
                              patience = 10,
                              verbose = 1,
                              restore_best_weights = True)

# Initiates reduced learning rate if validation loss does not continue to improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                                        factor = 0.2,
                                        patience = 3,
                                        verbose = 1,
                                        min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]
In [84]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_6.compile(optimizer = 'Adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [85]:
# Fitting model with epochs set to 100
history_6 = model_6.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 100, callbacks = callbacks_list)
Epoch 1/100
473/473 [==============================] - ETA: 0s - loss: 1.4412 - accuracy: 0.3567
Epoch 1: val_accuracy improved from -inf to 0.36649, saving model to ./model_6.h5
473/473 [==============================] - 43s 89ms/step - loss: 1.4412 - accuracy: 0.3567 - val_loss: 1.4990 - val_accuracy: 0.3665 - lr: 0.0010
Epoch 2/100
473/473 [==============================] - ETA: 0s - loss: 1.1413 - accuracy: 0.4874
Epoch 2: val_accuracy improved from 0.36649 to 0.52059, saving model to ./model_6.h5
473/473 [==============================] - 42s 89ms/step - loss: 1.1413 - accuracy: 0.4874 - val_loss: 1.0787 - val_accuracy: 0.5206 - lr: 0.0010
Epoch 3/100
473/473 [==============================] - ETA: 0s - loss: 0.9954 - accuracy: 0.5729
Epoch 3: val_accuracy did not improve from 0.52059
473/473 [==============================] - 41s 87ms/step - loss: 0.9954 - accuracy: 0.5729 - val_loss: 1.4114 - val_accuracy: 0.3769 - lr: 0.0010
Epoch 4/100
473/473 [==============================] - ETA: 0s - loss: 0.9205 - accuracy: 0.6166
Epoch 4: val_accuracy improved from 0.52059 to 0.57705, saving model to ./model_6.h5
473/473 [==============================] - 41s 87ms/step - loss: 0.9205 - accuracy: 0.6166 - val_loss: 1.0254 - val_accuracy: 0.5771 - lr: 0.0010
Epoch 5/100
473/473 [==============================] - ETA: 0s - loss: 0.8666 - accuracy: 0.6415
Epoch 5: val_accuracy improved from 0.57705 to 0.60438, saving model to ./model_6.h5
473/473 [==============================] - 1620s 3s/step - loss: 0.8666 - accuracy: 0.6415 - val_loss: 0.9523 - val_accuracy: 0.6044 - lr: 0.0010
Epoch 6/100
473/473 [==============================] - ETA: 0s - loss: 0.8290 - accuracy: 0.6543
Epoch 6: val_accuracy improved from 0.60438 to 0.65803, saving model to ./model_6.h5
473/473 [==============================] - 151s 319ms/step - loss: 0.8290 - accuracy: 0.6543 - val_loss: 0.8368 - val_accuracy: 0.6580 - lr: 0.0010
Epoch 7/100
473/473 [==============================] - ETA: 0s - loss: 0.7970 - accuracy: 0.6700
Epoch 7: val_accuracy did not improve from 0.65803
473/473 [==============================] - 43s 90ms/step - loss: 0.7970 - accuracy: 0.6700 - val_loss: 1.0110 - val_accuracy: 0.5596 - lr: 0.0010
Epoch 8/100
473/473 [==============================] - ETA: 0s - loss: 0.7768 - accuracy: 0.6795
Epoch 8: val_accuracy improved from 0.65803 to 0.69118, saving model to ./model_6.h5
473/473 [==============================] - 42s 89ms/step - loss: 0.7768 - accuracy: 0.6795 - val_loss: 0.8693 - val_accuracy: 0.6912 - lr: 0.0010
Epoch 9/100
473/473 [==============================] - ETA: 0s - loss: 0.7577 - accuracy: 0.6926
Epoch 9: val_accuracy did not improve from 0.69118

Epoch 9: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
473/473 [==============================] - 43s 91ms/step - loss: 0.7577 - accuracy: 0.6926 - val_loss: 0.8368 - val_accuracy: 0.6631 - lr: 0.0010
Epoch 10/100
473/473 [==============================] - ETA: 0s - loss: 0.6667 - accuracy: 0.7288
Epoch 10: val_accuracy improved from 0.69118 to 0.73478, saving model to ./model_6.h5
473/473 [==============================] - 42s 90ms/step - loss: 0.6667 - accuracy: 0.7288 - val_loss: 0.6538 - val_accuracy: 0.7348 - lr: 2.0000e-04
Epoch 11/100
473/473 [==============================] - ETA: 0s - loss: 0.6419 - accuracy: 0.7415
Epoch 11: val_accuracy improved from 0.73478 to 0.73599, saving model to ./model_6.h5
473/473 [==============================] - 43s 90ms/step - loss: 0.6419 - accuracy: 0.7415 - val_loss: 0.6677 - val_accuracy: 0.7360 - lr: 2.0000e-04
Epoch 12/100
473/473 [==============================] - ETA: 0s - loss: 0.6263 - accuracy: 0.7505
Epoch 12: val_accuracy did not improve from 0.73599
473/473 [==============================] - 42s 88ms/step - loss: 0.6263 - accuracy: 0.7505 - val_loss: 0.6744 - val_accuracy: 0.7245 - lr: 2.0000e-04
Epoch 13/100
473/473 [==============================] - ETA: 0s - loss: 0.6079 - accuracy: 0.7551
Epoch 13: val_accuracy did not improve from 0.73599

Epoch 13: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
473/473 [==============================] - 42s 89ms/step - loss: 0.6079 - accuracy: 0.7551 - val_loss: 0.7121 - val_accuracy: 0.7099 - lr: 2.0000e-04
Epoch 14/100
473/473 [==============================] - ETA: 0s - loss: 0.5842 - accuracy: 0.7673
Epoch 14: val_accuracy improved from 0.73599 to 0.75085, saving model to ./model_6.h5
473/473 [==============================] - 42s 89ms/step - loss: 0.5842 - accuracy: 0.7673 - val_loss: 0.6154 - val_accuracy: 0.7509 - lr: 4.0000e-05
Epoch 15/100
473/473 [==============================] - ETA: 0s - loss: 0.5717 - accuracy: 0.7719
Epoch 15: val_accuracy improved from 0.75085 to 0.75186, saving model to ./model_6.h5
473/473 [==============================] - 43s 91ms/step - loss: 0.5717 - accuracy: 0.7719 - val_loss: 0.6172 - val_accuracy: 0.7519 - lr: 4.0000e-05
Epoch 16/100
473/473 [==============================] - ETA: 0s - loss: 0.5682 - accuracy: 0.7764
Epoch 16: val_accuracy improved from 0.75186 to 0.76251, saving model to ./model_6.h5
473/473 [==============================] - 42s 89ms/step - loss: 0.5682 - accuracy: 0.7764 - val_loss: 0.6131 - val_accuracy: 0.7625 - lr: 4.0000e-05
Epoch 17/100
473/473 [==============================] - ETA: 0s - loss: 0.5629 - accuracy: 0.7715
Epoch 17: val_accuracy did not improve from 0.76251
473/473 [==============================] - 44s 94ms/step - loss: 0.5629 - accuracy: 0.7715 - val_loss: 0.6038 - val_accuracy: 0.7597 - lr: 4.0000e-05
Epoch 18/100
473/473 [==============================] - ETA: 0s - loss: 0.5459 - accuracy: 0.7809
Epoch 18: val_accuracy did not improve from 0.76251
473/473 [==============================] - 42s 88ms/step - loss: 0.5459 - accuracy: 0.7809 - val_loss: 0.6133 - val_accuracy: 0.7585 - lr: 4.0000e-05
Epoch 19/100
473/473 [==============================] - ETA: 0s - loss: 0.5516 - accuracy: 0.7811
Epoch 19: val_accuracy did not improve from 0.76251
473/473 [==============================] - 43s 90ms/step - loss: 0.5516 - accuracy: 0.7811 - val_loss: 0.6163 - val_accuracy: 0.7569 - lr: 4.0000e-05
Epoch 20/100
473/473 [==============================] - ETA: 0s - loss: 0.5459 - accuracy: 0.7828
Epoch 20: val_accuracy did not improve from 0.76251

Epoch 20: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
473/473 [==============================] - 42s 89ms/step - loss: 0.5459 - accuracy: 0.7828 - val_loss: 0.6205 - val_accuracy: 0.7545 - lr: 4.0000e-05
Epoch 21/100
473/473 [==============================] - ETA: 0s - loss: 0.5460 - accuracy: 0.7846
Epoch 21: val_accuracy did not improve from 0.76251
473/473 [==============================] - 44s 92ms/step - loss: 0.5460 - accuracy: 0.7846 - val_loss: 0.6054 - val_accuracy: 0.7565 - lr: 8.0000e-06
Epoch 22/100
473/473 [==============================] - ETA: 0s - loss: 0.5419 - accuracy: 0.7840
Epoch 22: val_accuracy improved from 0.76251 to 0.76713, saving model to ./model_6.h5
473/473 [==============================] - 44s 93ms/step - loss: 0.5419 - accuracy: 0.7840 - val_loss: 0.6065 - val_accuracy: 0.7671 - lr: 8.0000e-06
Epoch 23/100
473/473 [==============================] - ETA: 0s - loss: 0.5367 - accuracy: 0.7843
Epoch 23: val_accuracy did not improve from 0.76713

Epoch 23: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06.
473/473 [==============================] - 42s 90ms/step - loss: 0.5367 - accuracy: 0.7843 - val_loss: 0.6070 - val_accuracy: 0.7565 - lr: 8.0000e-06
Epoch 24/100
473/473 [==============================] - ETA: 0s - loss: 0.5346 - accuracy: 0.7888
Epoch 24: val_accuracy did not improve from 0.76713
473/473 [==============================] - 42s 89ms/step - loss: 0.5346 - accuracy: 0.7888 - val_loss: 0.6090 - val_accuracy: 0.7569 - lr: 1.6000e-06
Epoch 25/100
473/473 [==============================] - ETA: 0s - loss: 0.5327 - accuracy: 0.7844
Epoch 25: val_accuracy did not improve from 0.76713
473/473 [==============================] - 42s 89ms/step - loss: 0.5327 - accuracy: 0.7844 - val_loss: 0.6116 - val_accuracy: 0.7593 - lr: 1.6000e-06
Epoch 26/100
473/473 [==============================] - ETA: 0s - loss: 0.5433 - accuracy: 0.7871
Epoch 26: val_accuracy did not improve from 0.76713

Epoch 26: ReduceLROnPlateau reducing learning rate to 3.200000264769187e-07.
473/473 [==============================] - 42s 89ms/step - loss: 0.5433 - accuracy: 0.7871 - val_loss: 0.6168 - val_accuracy: 0.7557 - lr: 1.6000e-06
Epoch 27/100
473/473 [==============================] - ETA: 0s - loss: 0.5275 - accuracy: 0.7897
Epoch 27: val_accuracy did not improve from 0.76713
473/473 [==============================] - 41s 87ms/step - loss: 0.5275 - accuracy: 0.7897 - val_loss: 0.6019 - val_accuracy: 0.7585 - lr: 3.2000e-07
Epoch 28/100
473/473 [==============================] - ETA: 0s - loss: 0.5311 - accuracy: 0.7903
Epoch 28: val_accuracy did not improve from 0.76713
473/473 [==============================] - 49s 104ms/step - loss: 0.5311 - accuracy: 0.7903 - val_loss: 0.6096 - val_accuracy: 0.7587 - lr: 3.2000e-07
Epoch 29/100
473/473 [==============================] - ETA: 0s - loss: 0.5337 - accuracy: 0.7893
Epoch 29: val_accuracy did not improve from 0.76713
473/473 [==============================] - 43s 91ms/step - loss: 0.5337 - accuracy: 0.7893 - val_loss: 0.6000 - val_accuracy: 0.7625 - lr: 3.2000e-07
Epoch 30/100
473/473 [==============================] - ETA: 0s - loss: 0.5320 - accuracy: 0.7903
Epoch 30: val_accuracy did not improve from 0.76713
473/473 [==============================] - 41s 86ms/step - loss: 0.5320 - accuracy: 0.7903 - val_loss: 0.6133 - val_accuracy: 0.7583 - lr: 3.2000e-07
Epoch 31/100
473/473 [==============================] - ETA: 0s - loss: 0.5377 - accuracy: 0.7856
Epoch 31: val_accuracy did not improve from 0.76713
473/473 [==============================] - 42s 89ms/step - loss: 0.5377 - accuracy: 0.7856 - val_loss: 0.6099 - val_accuracy: 0.7611 - lr: 3.2000e-07
Epoch 32/100
473/473 [==============================] - ETA: 0s - loss: 0.5352 - accuracy: 0.7913
Epoch 32: val_accuracy did not improve from 0.76713

Epoch 32: ReduceLROnPlateau reducing learning rate to 6.400000529538374e-08.
473/473 [==============================] - 41s 87ms/step - loss: 0.5352 - accuracy: 0.7913 - val_loss: 0.6081 - val_accuracy: 0.7601 - lr: 3.2000e-07
Epoch 33/100
473/473 [==============================] - ETA: 0s - loss: 0.5360 - accuracy: 0.7874
Epoch 33: val_accuracy did not improve from 0.76713
473/473 [==============================] - 41s 87ms/step - loss: 0.5360 - accuracy: 0.7874 - val_loss: 0.5998 - val_accuracy: 0.7639 - lr: 6.4000e-08
Epoch 34/100
473/473 [==============================] - ETA: 0s - loss: 0.5304 - accuracy: 0.7885
Epoch 34: val_accuracy did not improve from 0.76713
473/473 [==============================] - 42s 89ms/step - loss: 0.5304 - accuracy: 0.7885 - val_loss: 0.6073 - val_accuracy: 0.7563 - lr: 6.4000e-08
Epoch 35/100
473/473 [==============================] - ETA: 0s - loss: 0.5398 - accuracy: 0.7830
Epoch 35: val_accuracy did not improve from 0.76713
473/473 [==============================] - 42s 88ms/step - loss: 0.5398 - accuracy: 0.7830 - val_loss: 0.6072 - val_accuracy: 0.7609 - lr: 6.4000e-08
Epoch 36/100
473/473 [==============================] - ETA: 0s - loss: 0.5339 - accuracy: 0.7940
Epoch 36: val_accuracy did not improve from 0.76713

Epoch 36: ReduceLROnPlateau reducing learning rate to 1.2800001059076749e-08.
473/473 [==============================] - 42s 89ms/step - loss: 0.5339 - accuracy: 0.7940 - val_loss: 0.6040 - val_accuracy: 0.7613 - lr: 6.4000e-08
Epoch 37/100
473/473 [==============================] - ETA: 0s - loss: 0.5418 - accuracy: 0.7862
Epoch 37: val_accuracy did not improve from 0.76713
473/473 [==============================] - 41s 86ms/step - loss: 0.5418 - accuracy: 0.7862 - val_loss: 0.6112 - val_accuracy: 0.7577 - lr: 1.2800e-08
Epoch 38/100
473/473 [==============================] - ETA: 0s - loss: 0.5349 - accuracy: 0.7855
Epoch 38: val_accuracy did not improve from 0.76713
473/473 [==============================] - 44s 92ms/step - loss: 0.5349 - accuracy: 0.7855 - val_loss: 0.6158 - val_accuracy: 0.7595 - lr: 1.2800e-08
Epoch 39/100
473/473 [==============================] - ETA: 0s - loss: 0.5322 - accuracy: 0.7909
Epoch 39: val_accuracy did not improve from 0.76713

Epoch 39: ReduceLROnPlateau reducing learning rate to 2.5600002118153498e-09.
473/473 [==============================] - 42s 89ms/step - loss: 0.5322 - accuracy: 0.7909 - val_loss: 0.6091 - val_accuracy: 0.7593 - lr: 1.2800e-08
Epoch 40/100
473/473 [==============================] - ETA: 0s - loss: 0.5314 - accuracy: 0.7896
Epoch 40: val_accuracy did not improve from 0.76713
473/473 [==============================] - 42s 88ms/step - loss: 0.5314 - accuracy: 0.7896 - val_loss: 0.6054 - val_accuracy: 0.7583 - lr: 2.5600e-09
Epoch 41/100
473/473 [==============================] - ETA: 0s - loss: 0.5287 - accuracy: 0.7948
Epoch 41: val_accuracy did not improve from 0.76713
473/473 [==============================] - 44s 93ms/step - loss: 0.5287 - accuracy: 0.7948 - val_loss: 0.6059 - val_accuracy: 0.7577 - lr: 2.5600e-09
Epoch 42/100
473/473 [==============================] - ETA: 0s - loss: 0.5367 - accuracy: 0.7877
Epoch 42: val_accuracy did not improve from 0.76713

Epoch 42: ReduceLROnPlateau reducing learning rate to 5.1200004236307e-10.
473/473 [==============================] - 43s 90ms/step - loss: 0.5367 - accuracy: 0.7877 - val_loss: 0.6013 - val_accuracy: 0.7627 - lr: 2.5600e-09
Epoch 43/100
473/473 [==============================] - ETA: 0s - loss: 0.5267 - accuracy: 0.7914
Epoch 43: val_accuracy did not improve from 0.76713
Restoring model weights from the end of the best epoch: 33.
473/473 [==============================] - 42s 89ms/step - loss: 0.5267 - accuracy: 0.7914 - val_loss: 0.6077 - val_accuracy: 0.7619 - lr: 5.1200e-10
Epoch 43: early stopping
In [86]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_6.history['accuracy'])
plt.plot(history_6.history['val_accuracy'])
plt.title('Accuracy - Complex Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [87]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_6.history['loss'])
plt.plot(history_6.history['val_loss'])
plt.title('Loss - Complex Model')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the Model on Test Set¶

In [88]:
# Evaluating the model's performance on the test set
accuracy = model_6.evaluate(test_set_grayscale)
4/4 [==============================] - 0s 36ms/step - loss: 0.5633 - accuracy: 0.7578

Observations and Insights:
Model 6, our Milestone 1 model, outperforms all previous models. After 33 epochs (best epoch), training accuracy stands at 0.79 and validation accuracy is 0.76. Accuracy and loss for both training and validation data improve similarly before leveling off. The model begins to overfit around epoch 15, but the overfitting is not as severe as previous models. The test accuracy for this model is 0.76. Overall, Model 6 generalizes better than previous models, and is the top performer thus far. That said, it is still an overfitting model, and thus it would not be advisable to deploy this model as is.

This model underwent numerous transformations before arriving at its final state. Parameters were tuned, layers were added, layers were removed, and eventually the above model was determined to be the best iteration. An abridged history of model development can be found in the table below.

The starting point for our final model was as follows:

CONVOLUTIONAL BLOCK #1

  • Conv2D(64,(2,2), input shape = (48,48,1), activation = 'relu', padding = 'same')
  • BatchNormalization
  • LeakyReLU(alpha = 0.1)
  • MaxPooling2D(2,2)

CONVOLUTIONAL BLOCK #2

  • Conv2D(128,(2,2), activation = 'relu', padding = 'same')
  • BatchNormalization
  • LeakyReLU(alpha = 0.1)
  • MaxPooling2D(2,2)

CONVOLUTIONAL BLOCK #3

  • Conv2D(512,(2,2), activation = 'relu', padding = 'same')
  • BatchNormalization
  • LeakyReLU(alpha = 0.1)
  • MaxPooling2D(2,2)

CONVOLUTIONAL BLOCK #4

  • Conv2D(256,(2,2), activation = 'relu', padding = 'same')
  • BatchNormalization
  • LeakyReLU(alpha = 0.1)
  • MaxPooling2D(2,2)

CONVOLUTIONAL BLOCK #5

  • Conv2D(128,(2,2), activation = 'relu', padding = 'same')
  • BatchNormalization
  • LeakyReLU(alpha = 0.1)
  • MaxPooling2D(2,2)

FINAL LAYERS

  • Flatten
  • Dense(256, activation = 'relu')
  • Dropout(0.1)
  • Dense(256, activation = 'relu')
  • Dropout(0.1)
  • Dense(4, activation = 'softmax')

PARAMETERS

  • Batch size = 32
  • horizontal_flip = True
  • rescale = 1./255
  • brightness_range = (0.0,2.0)
  • shear_range = 0.3

Below is an abridged summary of actions taken to improve the model. In many cases, parameters or layers were adjusted, added, or removed, just to be returned to their original state when the desired or experimental impact was not realized. The model went through dozens of iterations, with the following transformations being the most impactful.

Action Taken Train Loss Train Accuracy Val Loss Val Accuracy
Starting model as outlined above 0.77 0.70 0.89 0.58
Dropout(0.1) layers added to conv blocks 1 and 5 to reduce overfitting 0.75 0.74 0.66 0.61
Shear_range removed entirely to determine effect 0.76 0.74 0.68 0.60
Rotation_range added and optimized 0.74 0.74 0.62 0.61
Additional dropout layers added to blocks 2 and 4 0.59 0.78 0.64 0.68
Number of neurons in final dense layer set to 512 0.68 0.71 0.62 0.71
Number of neurons in block 4 increased to 512 0.70 0.73 0.60 0.74
Dropout layers swapped out for GaussianNoise in blocks 1 and 2 0.61 0.74 0.57 0.75
Brightness_range narrowed to (0.5,1.5) then to (0.7,1.3) 0.59 0.75 0.60 0.75
Kernel size enlarged to 3x3 in first then also second block 0.55 0.78 0.57 0.75
Dropout in block 5 reduced to 0.5, resulting in final model 0.54 0.79 0.60 0.76


Final Solution¶

Model 7: Goodbye Overfitting¶

While Model 6 was an improvement on previous models, it was still overfitting the training data. In order to feel comfortable recommending a model for deployment in the context of this project, we need to improve on Model 6. Model 7 is an attempt to develop a deployable CNN. We want our model to have high accuracy, while also maintaining a good fit (no overfitting/underfitting) and generalizing well to the unseen test data. We will continue with color_mode set to grayscale for the reasons already noted: slightly better performance, slightly fewer parameters, slightly lower computational expense, and the image data itself is already grayscale.

Creating our Data Loaders¶

We will once again be creating new data loaders for Model 7. As mentioned earlier, since our data augmentation takes place when we instantiate an ImageDataGenerator object, it is convenient to create data loaders specific to our new model so we can easily finetune our hyperparameters as needed. The ImageDataGenerators below include the parameters of our final, highest performing iteration of the model. They were once again chosen after exhaustive finetuning, as discussed later.

  • Batch size is set to 128. The model was tested with batch sizes of 16, 32, 64, 128, and 256. A batch size of 128 performed the best. The smallest batch sizes seemed to get stuck in an accuracy range of 25-30% (perhaps a local minimum), while the other sizes did not generalize as well to the test data.
  • horizontal_flip is set to 'True'. As some faces in the images face left while others face right or straight ahead, flipping the training images improves our model's ability to learn that horizontal orientation should not affect the eventual classification.
  • rescale is equal to 1./255, which normalizes the pixel values to a number between 0 and 1. This helps to prevent vanishing and exploding gradients in our network by keeping the numbers small and manageable.
  • brightness_range is set to '0.0,2.0'. This is a change from Model 6 where we used a narrower range. A narrower range did not help within the architecture of Model 7, and the broader range showed better performance.
  • shear_range is set to 0.3, which matches the settings of our baseline models. This parameter essentially distorts the image along an axis in a counter-clockwise direction.
  • one-hot-encoding is handled by setting class_mode to "categorical", followed by our list of classes.
  • Additional data augmentation methods were attempted and later removed after failing to significantly improve model performance. Among those tested were width_shift_range, height_shift_range, rotation_range, zca_whitening, zoom_range, and even vertical_flip.
In [89]:
batch_size  = 128

# Creating ImageDataGenerator objects for grayscale colormode 
datagen_train_grayscale = ImageDataGenerator(rescale=1./255, 
                                             brightness_range=(0.0,2.0), 
                                             horizontal_flip=True,
                                             shear_range=0.3)


datagen_validation_grayscale = ImageDataGenerator(rescale=1./255, 
                                             brightness_range=(0.0,2.0), 
                                             horizontal_flip=True,
                                             shear_range=0.3)


datagen_test_grayscale = ImageDataGenerator(rescale=1./255, 
                                             brightness_range=(0.0,2.0), 
                                             horizontal_flip=True,
                                             shear_range=0.3)



# Creating train, validation, and test sets for grayscale colormode

print("Grayscale Images")

train_set_grayscale = datagen_train_grayscale.flow_from_directory(dir_train,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = True)

val_set_grayscale = datagen_validation_grayscale.flow_from_directory(dir_validation,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)

test_set_grayscale = datagen_test_grayscale.flow_from_directory(dir_test,
                        target_size = (img_size, img_size),
                        color_mode = "grayscale",
                        batch_size = batch_size,
                        class_mode = 'categorical',
                        classes = ['happy', 'sad', 'neutral', 'surprise'],
                        seed = 42,
                        shuffle = False)
Grayscale Images
Found 15109 images belonging to 4 classes.
Found 4977 images belonging to 4 classes.
Found 128 images belonging to 4 classes.

Model Building¶

The structure of Model 7 is below. Rather than simply modifying Model 6, the development of Model 7 entailed going back to the drawing board and devising a new strategy. Many configurations were tested, and the following architecture led to the best, most generalizable performance.

  • The model begins with an input layer accepting an input shape of '48,48,1', given that our color_mode has been set to grayscale.
  • There are 3 similar convolutional blocks with relu activation. Padding is no longer set to "same", as this increased the generalization gap. Each block contains a BatchNormalization layer before its first and second convolutional layers (except the input layer in Block #1). Each block ends with MaxPooling and a Dropout layer set to 0.4.
  • A "secret" block, which is what eventually closed the generalization gap and eliminated overfitting, is essentially a normalization/regularization block consisting of a BatchNormalization layer and a convolutional layer without activation, but instead with a L2 regularization set to 0.025. This is followed by another BatchNormalization layer.
  • The output of the "secret" block is then flattened, and fed into 2 dense layers, each followed by a Dropout layer, and separated by a layer of GaussianNoise.
  • The architecture is completed with a softmax classifier, as this model is designed for multi-class classification. Test images will be classified as either happy, sad, neutral, or surprise.
  • The final model contains 1.8 million parameters and 27 layers, making it slightly less complex than Model 6, while still substantially more complex than our initial models.
In [90]:
# Creating a Sequential model
model_7 = Sequential()
 
# Convolutional Block #1
model_7.add(Conv2D(64, (3, 3), input_shape = (48, 48, 1), activation = 'relu'))
model_7.add(BatchNormalization())
model_7.add(Conv2D(64, (3, 3), activation = 'relu'))
model_7.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model_7.add(Dropout(0.4))

# Convolutional Block #2
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(MaxPooling2D(pool_size = (2, 2), strides=(2,2)))
model_7.add(Dropout(0.4))

# Convolutional Block #3
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (3, 3), activation='relu'))
model_7.add(MaxPooling2D(pool_size = (2, 2), strides=(2,2)))
model_7.add(Dropout(0.4))

# SECRET LEVEL
model_7.add(BatchNormalization())
model_7.add(Conv2D(128, (2, 2), kernel_regularizer = l2(0.025)))
model_7.add(BatchNormalization())

# Flatten layer
model_7.add(Flatten())

# Dense layers
model_7.add(Dense(1024, activation = 'relu'))
model_7.add(Dropout(0.2))
model_7.add(GaussianNoise(0.1))
model_7.add(Dense(1024, activation = 'relu'))
model_7.add(Dropout(0.2))

# Classifier
model_7.add(Dense(4, activation = 'softmax'))

model_7.summary()
Metal device set to: Apple M1 Pro
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 46, 46, 64)        640       
                                                                 
 batch_normalization (BatchN  (None, 46, 46, 64)       256       
 ormalization)                                                   
                                                                 
 conv2d_1 (Conv2D)           (None, 44, 44, 64)        36928     
                                                                 
 max_pooling2d (MaxPooling2D  (None, 22, 22, 64)       0         
 )                                                               
                                                                 
 dropout (Dropout)           (None, 22, 22, 64)        0         
                                                                 
 batch_normalization_1 (Batc  (None, 22, 22, 64)       256       
 hNormalization)                                                 
                                                                 
 conv2d_2 (Conv2D)           (None, 20, 20, 128)       73856     
                                                                 
 batch_normalization_2 (Batc  (None, 20, 20, 128)      512       
 hNormalization)                                                 
                                                                 
 conv2d_3 (Conv2D)           (None, 18, 18, 128)       147584    
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 9, 9, 128)        0         
 2D)                                                             
                                                                 
 dropout_1 (Dropout)         (None, 9, 9, 128)         0         
                                                                 
 batch_normalization_3 (Batc  (None, 9, 9, 128)        512       
 hNormalization)                                                 
                                                                 
 conv2d_4 (Conv2D)           (None, 7, 7, 128)         147584    
                                                                 
 batch_normalization_4 (Batc  (None, 7, 7, 128)        512       
 hNormalization)                                                 
                                                                 
 conv2d_5 (Conv2D)           (None, 5, 5, 128)         147584    
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 2, 2, 128)        0         
 2D)                                                             
                                                                 
 dropout_2 (Dropout)         (None, 2, 2, 128)         0         
                                                                 
 batch_normalization_5 (Batc  (None, 2, 2, 128)        512       
 hNormalization)                                                 
                                                                 
 conv2d_6 (Conv2D)           (None, 1, 1, 128)         65664     
                                                                 
 batch_normalization_6 (Batc  (None, 1, 1, 128)        512       
 hNormalization)                                                 
                                                                 
 flatten (Flatten)           (None, 128)               0         
                                                                 
 dense (Dense)               (None, 1024)              132096    
                                                                 
 dropout_3 (Dropout)         (None, 1024)              0         
                                                                 
 gaussian_noise (GaussianNoi  (None, 1024)             0         
 se)                                                             
                                                                 
 dense_1 (Dense)             (None, 1024)              1049600   
                                                                 
 dropout_4 (Dropout)         (None, 1024)              0         
                                                                 
 dense_2 (Dense)             (None, 4)                 4100      
                                                                 
=================================================================
Total params: 1,808,708
Trainable params: 1,807,172
Non-trainable params: 1,536
_________________________________________________________________

Compiling and Training the Model¶

In [91]:
# Creating a checkpoint which saves model weights from the best epoch
checkpoint = ModelCheckpoint('./model_7.h5', monitor='val_accuracy', verbose=1, save_best_only=True, mode='auto')

# Initiates early stopping if validation loss does not continue to improve
early_stopping = EarlyStopping(monitor = 'val_loss',
                          min_delta = 0,
                          patience = 5,
                          verbose = 1,
                          restore_best_weights = True)

# Slows the learning rate when validation loss does not improve
reduce_learningrate = ReduceLROnPlateau(monitor = 'val_loss',
                              factor = 0.2,
                              patience = 2,
                              verbose = 1,
                              min_delta = 0.0001)

callbacks_list = [checkpoint, early_stopping, reduce_learningrate]

Note:

  • Early stopping patience is set to 5 epochs. This model was trained with Patience set to 5, 10, 12, 15, 20, and 50. Each time, the model achieved the same results, so the simpler model (patience = 5) was chosen.
  • Reduce learning rate patience is set to 2 epochs. Again, the model was trained with patience set to 1, 2, 3, and 5. The results varied considerably, with 2 epochs being the only iteration that did not result in a generalization gap.
In [92]:
# Compiling model with optimizer set to Adam, loss set to categorical_crossentropy, and metrics set to accuracy
model_7.compile(optimizer = 'Adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
In [93]:
# Fitting model with epochs set to 200
history_7 = model_7.fit(train_set_grayscale, validation_data = val_set_grayscale, epochs = 200, callbacks = callbacks_list)
Epoch 1/200
119/119 [==============================] - ETA: 0s - loss: 2.4585 - accuracy: 0.2759
Epoch 1: val_accuracy improved from -inf to 0.19510, saving model to ./model_7.h5
119/119 [==============================] - 21s 164ms/step - loss: 2.4585 - accuracy: 0.2759 - val_loss: 1.5317 - val_accuracy: 0.1951 - lr: 0.0010
Epoch 2/200
119/119 [==============================] - ETA: 0s - loss: 1.3727 - accuracy: 0.3706
Epoch 2: val_accuracy improved from 0.19510 to 0.28752, saving model to ./model_7.h5
119/119 [==============================] - 18s 149ms/step - loss: 1.3727 - accuracy: 0.3706 - val_loss: 1.4891 - val_accuracy: 0.2875 - lr: 0.0010
Epoch 3/200
119/119 [==============================] - ETA: 0s - loss: 1.2177 - accuracy: 0.4810
Epoch 3: val_accuracy improved from 0.28752 to 0.32811, saving model to ./model_7.h5
119/119 [==============================] - 18s 147ms/step - loss: 1.2177 - accuracy: 0.4810 - val_loss: 1.3786 - val_accuracy: 0.3281 - lr: 0.0010
Epoch 4/200
119/119 [==============================] - ETA: 0s - loss: 1.0825 - accuracy: 0.5517
Epoch 4: val_accuracy improved from 0.32811 to 0.36186, saving model to ./model_7.h5
119/119 [==============================] - 18s 155ms/step - loss: 1.0825 - accuracy: 0.5517 - val_loss: 1.3412 - val_accuracy: 0.3619 - lr: 0.0010
Epoch 5/200
119/119 [==============================] - ETA: 0s - loss: 1.0186 - accuracy: 0.5914
Epoch 5: val_accuracy improved from 0.36186 to 0.47820, saving model to ./model_7.h5
119/119 [==============================] - 18s 154ms/step - loss: 1.0186 - accuracy: 0.5914 - val_loss: 1.1709 - val_accuracy: 0.4782 - lr: 0.0010
Epoch 6/200
119/119 [==============================] - ETA: 0s - loss: 0.9598 - accuracy: 0.6138
Epoch 6: val_accuracy improved from 0.47820 to 0.63432, saving model to ./model_7.h5
119/119 [==============================] - 17s 146ms/step - loss: 0.9598 - accuracy: 0.6138 - val_loss: 0.9410 - val_accuracy: 0.6343 - lr: 0.0010
Epoch 7/200
119/119 [==============================] - ETA: 0s - loss: 0.9179 - accuracy: 0.6325
Epoch 7: val_accuracy did not improve from 0.63432
119/119 [==============================] - 18s 155ms/step - loss: 0.9179 - accuracy: 0.6325 - val_loss: 1.0329 - val_accuracy: 0.6014 - lr: 0.0010
Epoch 8/200
119/119 [==============================] - ETA: 0s - loss: 0.9175 - accuracy: 0.6464
Epoch 8: val_accuracy improved from 0.63432 to 0.63954, saving model to ./model_7.h5
119/119 [==============================] - 18s 147ms/step - loss: 0.9175 - accuracy: 0.6464 - val_loss: 0.8771 - val_accuracy: 0.6395 - lr: 0.0010
Epoch 9/200
119/119 [==============================] - ETA: 0s - loss: 0.8740 - accuracy: 0.6580
Epoch 9: val_accuracy improved from 0.63954 to 0.68535, saving model to ./model_7.h5
119/119 [==============================] - 17s 146ms/step - loss: 0.8740 - accuracy: 0.6580 - val_loss: 0.8390 - val_accuracy: 0.6854 - lr: 0.0010
Epoch 10/200
119/119 [==============================] - ETA: 0s - loss: 0.8452 - accuracy: 0.6654
Epoch 10: val_accuracy did not improve from 0.68535
119/119 [==============================] - 17s 143ms/step - loss: 0.8452 - accuracy: 0.6654 - val_loss: 0.9580 - val_accuracy: 0.6201 - lr: 0.0010
Epoch 11/200
119/119 [==============================] - ETA: 0s - loss: 0.8622 - accuracy: 0.6655
Epoch 11: val_accuracy improved from 0.68535 to 0.70123, saving model to ./model_7.h5
119/119 [==============================] - 18s 152ms/step - loss: 0.8622 - accuracy: 0.6655 - val_loss: 0.7876 - val_accuracy: 0.7012 - lr: 0.0010
Epoch 12/200
119/119 [==============================] - ETA: 0s - loss: 0.8811 - accuracy: 0.6699
Epoch 12: val_accuracy did not improve from 0.70123
119/119 [==============================] - 17s 146ms/step - loss: 0.8811 - accuracy: 0.6699 - val_loss: 0.8592 - val_accuracy: 0.6821 - lr: 0.0010
Epoch 13/200
119/119 [==============================] - ETA: 0s - loss: 0.8345 - accuracy: 0.6801
Epoch 13: val_accuracy did not improve from 0.70123

Epoch 13: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
119/119 [==============================] - 18s 153ms/step - loss: 0.8345 - accuracy: 0.6801 - val_loss: 0.7922 - val_accuracy: 0.6864 - lr: 0.0010
Epoch 14/200
119/119 [==============================] - ETA: 0s - loss: 0.7443 - accuracy: 0.7130
Epoch 14: val_accuracy improved from 0.70123 to 0.73237, saving model to ./model_7.h5
119/119 [==============================] - 18s 151ms/step - loss: 0.7443 - accuracy: 0.7130 - val_loss: 0.7052 - val_accuracy: 0.7324 - lr: 2.0000e-04
Epoch 15/200
119/119 [==============================] - ETA: 0s - loss: 0.7103 - accuracy: 0.7206
Epoch 15: val_accuracy improved from 0.73237 to 0.73719, saving model to ./model_7.h5
119/119 [==============================] - 18s 147ms/step - loss: 0.7103 - accuracy: 0.7206 - val_loss: 0.6913 - val_accuracy: 0.7372 - lr: 2.0000e-04
Epoch 16/200
119/119 [==============================] - ETA: 0s - loss: 0.7012 - accuracy: 0.7239
Epoch 16: val_accuracy did not improve from 0.73719
119/119 [==============================] - 17s 146ms/step - loss: 0.7012 - accuracy: 0.7239 - val_loss: 0.6851 - val_accuracy: 0.7316 - lr: 2.0000e-04
Epoch 17/200
119/119 [==============================] - ETA: 0s - loss: 0.7024 - accuracy: 0.7243
Epoch 17: val_accuracy did not improve from 0.73719
119/119 [==============================] - 17s 145ms/step - loss: 0.7024 - accuracy: 0.7243 - val_loss: 0.6888 - val_accuracy: 0.7296 - lr: 2.0000e-04
Epoch 18/200
119/119 [==============================] - ETA: 0s - loss: 0.6947 - accuracy: 0.7277
Epoch 18: val_accuracy did not improve from 0.73719

Epoch 18: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
119/119 [==============================] - 18s 147ms/step - loss: 0.6947 - accuracy: 0.7277 - val_loss: 0.6891 - val_accuracy: 0.7314 - lr: 2.0000e-04
Epoch 19/200
119/119 [==============================] - ETA: 0s - loss: 0.6722 - accuracy: 0.7329
Epoch 19: val_accuracy improved from 0.73719 to 0.74242, saving model to ./model_7.h5
119/119 [==============================] - 18s 153ms/step - loss: 0.6722 - accuracy: 0.7329 - val_loss: 0.6699 - val_accuracy: 0.7424 - lr: 4.0000e-05
Epoch 20/200
119/119 [==============================] - ETA: 0s - loss: 0.6663 - accuracy: 0.7346
Epoch 20: val_accuracy improved from 0.74242 to 0.74342, saving model to ./model_7.h5
119/119 [==============================] - 18s 148ms/step - loss: 0.6663 - accuracy: 0.7346 - val_loss: 0.6622 - val_accuracy: 0.7434 - lr: 4.0000e-05
Epoch 21/200
119/119 [==============================] - ETA: 0s - loss: 0.6665 - accuracy: 0.7331
Epoch 21: val_accuracy did not improve from 0.74342
119/119 [==============================] - 18s 149ms/step - loss: 0.6665 - accuracy: 0.7331 - val_loss: 0.6600 - val_accuracy: 0.7416 - lr: 4.0000e-05
Epoch 22/200
119/119 [==============================] - ETA: 0s - loss: 0.6624 - accuracy: 0.7349
Epoch 22: val_accuracy did not improve from 0.74342
119/119 [==============================] - 17s 146ms/step - loss: 0.6624 - accuracy: 0.7349 - val_loss: 0.6577 - val_accuracy: 0.7434 - lr: 4.0000e-05
Epoch 23/200
119/119 [==============================] - ETA: 0s - loss: 0.6498 - accuracy: 0.7408
Epoch 23: val_accuracy improved from 0.74342 to 0.74623, saving model to ./model_7.h5
119/119 [==============================] - 17s 145ms/step - loss: 0.6498 - accuracy: 0.7408 - val_loss: 0.6563 - val_accuracy: 0.7462 - lr: 4.0000e-05
Epoch 24/200
119/119 [==============================] - ETA: 0s - loss: 0.6526 - accuracy: 0.7377
Epoch 24: val_accuracy did not improve from 0.74623
119/119 [==============================] - 17s 145ms/step - loss: 0.6526 - accuracy: 0.7377 - val_loss: 0.6579 - val_accuracy: 0.7412 - lr: 4.0000e-05
Epoch 25/200
119/119 [==============================] - ETA: 0s - loss: 0.6451 - accuracy: 0.7409
Epoch 25: val_accuracy improved from 0.74623 to 0.74784, saving model to ./model_7.h5
119/119 [==============================] - 18s 155ms/step - loss: 0.6451 - accuracy: 0.7409 - val_loss: 0.6534 - val_accuracy: 0.7478 - lr: 4.0000e-05
Epoch 26/200
119/119 [==============================] - ETA: 0s - loss: 0.6475 - accuracy: 0.7390
Epoch 26: val_accuracy did not improve from 0.74784
119/119 [==============================] - 18s 152ms/step - loss: 0.6475 - accuracy: 0.7390 - val_loss: 0.6450 - val_accuracy: 0.7436 - lr: 4.0000e-05
Epoch 27/200
119/119 [==============================] - ETA: 0s - loss: 0.6451 - accuracy: 0.7389
Epoch 27: val_accuracy improved from 0.74784 to 0.74844, saving model to ./model_7.h5
119/119 [==============================] - 17s 143ms/step - loss: 0.6451 - accuracy: 0.7389 - val_loss: 0.6431 - val_accuracy: 0.7484 - lr: 4.0000e-05
Epoch 28/200
119/119 [==============================] - ETA: 0s - loss: 0.6431 - accuracy: 0.7427
Epoch 28: val_accuracy did not improve from 0.74844
119/119 [==============================] - 18s 147ms/step - loss: 0.6431 - accuracy: 0.7427 - val_loss: 0.6518 - val_accuracy: 0.7412 - lr: 4.0000e-05
Epoch 29/200
119/119 [==============================] - ETA: 0s - loss: 0.6350 - accuracy: 0.7465
Epoch 29: val_accuracy did not improve from 0.74844

Epoch 29: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
119/119 [==============================] - 17s 146ms/step - loss: 0.6350 - accuracy: 0.7465 - val_loss: 0.6473 - val_accuracy: 0.7458 - lr: 4.0000e-05
Epoch 30/200
119/119 [==============================] - ETA: 0s - loss: 0.6359 - accuracy: 0.7452
Epoch 30: val_accuracy did not improve from 0.74844
119/119 [==============================] - 20s 166ms/step - loss: 0.6359 - accuracy: 0.7452 - val_loss: 0.6513 - val_accuracy: 0.7422 - lr: 8.0000e-06
Epoch 31/200
119/119 [==============================] - ETA: 0s - loss: 0.6340 - accuracy: 0.7468
Epoch 31: val_accuracy did not improve from 0.74844

Epoch 31: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06.
119/119 [==============================] - 20s 166ms/step - loss: 0.6340 - accuracy: 0.7468 - val_loss: 0.6469 - val_accuracy: 0.7450 - lr: 8.0000e-06
Epoch 32/200
119/119 [==============================] - ETA: 0s - loss: 0.6317 - accuracy: 0.7463
Epoch 32: val_accuracy did not improve from 0.74844
119/119 [==============================] - 19s 158ms/step - loss: 0.6317 - accuracy: 0.7463 - val_loss: 0.6409 - val_accuracy: 0.7454 - lr: 1.6000e-06
Epoch 33/200
119/119 [==============================] - ETA: 0s - loss: 0.6375 - accuracy: 0.7435
Epoch 33: val_accuracy did not improve from 0.74844
119/119 [==============================] - 16s 139ms/step - loss: 0.6375 - accuracy: 0.7435 - val_loss: 0.6499 - val_accuracy: 0.7436 - lr: 1.6000e-06
Epoch 34/200
119/119 [==============================] - ETA: 0s - loss: 0.6296 - accuracy: 0.7457
Epoch 34: val_accuracy did not improve from 0.74844
119/119 [==============================] - 18s 148ms/step - loss: 0.6296 - accuracy: 0.7457 - val_loss: 0.6377 - val_accuracy: 0.7472 - lr: 1.6000e-06
Epoch 35/200
119/119 [==============================] - ETA: 0s - loss: 0.6268 - accuracy: 0.7464
Epoch 35: val_accuracy did not improve from 0.74844
119/119 [==============================] - 17s 145ms/step - loss: 0.6268 - accuracy: 0.7464 - val_loss: 0.6451 - val_accuracy: 0.7422 - lr: 1.6000e-06
Epoch 36/200
119/119 [==============================] - ETA: 0s - loss: 0.6312 - accuracy: 0.7454
Epoch 36: val_accuracy did not improve from 0.74844

Epoch 36: ReduceLROnPlateau reducing learning rate to 3.200000264769187e-07.
119/119 [==============================] - 17s 143ms/step - loss: 0.6312 - accuracy: 0.7454 - val_loss: 0.6426 - val_accuracy: 0.7464 - lr: 1.6000e-06
Epoch 37/200
119/119 [==============================] - ETA: 0s - loss: 0.6247 - accuracy: 0.7494
Epoch 37: val_accuracy did not improve from 0.74844
119/119 [==============================] - 17s 142ms/step - loss: 0.6247 - accuracy: 0.7494 - val_loss: 0.6382 - val_accuracy: 0.7426 - lr: 3.2000e-07
Epoch 38/200
119/119 [==============================] - ETA: 0s - loss: 0.6301 - accuracy: 0.7474
Epoch 38: val_accuracy improved from 0.74844 to 0.74945, saving model to ./model_7.h5

Epoch 38: ReduceLROnPlateau reducing learning rate to 6.400000529538374e-08.
119/119 [==============================] - 17s 144ms/step - loss: 0.6301 - accuracy: 0.7474 - val_loss: 0.6422 - val_accuracy: 0.7494 - lr: 3.2000e-07
Epoch 39/200
119/119 [==============================] - ETA: 0s - loss: 0.6324 - accuracy: 0.7408
Epoch 39: val_accuracy did not improve from 0.74945
Restoring model weights from the end of the best epoch: 34.
119/119 [==============================] - 17s 141ms/step - loss: 0.6324 - accuracy: 0.7408 - val_loss: 0.6408 - val_accuracy: 0.7490 - lr: 6.4000e-08
Epoch 39: early stopping
In [94]:
# Plotting the accuracies

plt.figure(figsize = (10, 5))
plt.plot(history_7.history['accuracy'])
plt.plot(history_7.history['val_accuracy'])
plt.title('Accuracy - Final Model')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show()
In [95]:
# Plotting the losses

plt.figure(figsize = (10, 5))
plt.plot(history_7.history['loss'])
plt.plot(history_7.history['val_loss'])
plt.title('Loss - Final Model')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()

Evaluating the Model on Test Set¶

In [96]:
# Evaluating the model's performance on the test set
accuracy = model_7.evaluate(test_set_grayscale)
1/1 [==============================] - 0s 359ms/step - loss: 0.6198 - accuracy: 0.7500

Observations and Insights:

Model 7, rewarding us for all of our efforts, displays the best all-around performance. Accuracies for training, validation, and test data are stable at 0.75, while loss is stable across training, validation, and test data at roughly 0.63 (0.62 to 0.64). As evidenced by the above graphs, there is no noticeable generalization gap. The accuracy and loss curves move more or less in tandem, leveling off around epoch 25 and remaining together from that point forward. The model does not overfit or underfit the training data. The images below show the accuracy and loss curves for the same model run out to 115 epochs. The model converges at reasonable levels of accuracy and loss, and it generalizes well.

Drawing Drawing


Much like Model 6, this model underwent numerous transformations before arriving at its final state. Parameters were tuned, layers were added, others were removed, and in the end, the above iteration of the model was determined to be the best. Below are the impacts that some of the most important aspects of the model have on its overall performance. While some individual metrics may be better than those of the final model, each of the modifications below, if taken individually or in tandem, results in a generalization gap that is not present in the final model.

Model Changes Train Loss Train Accuracy Val Loss Val Accuracy
Final Model 0.63 0.75 0.64 0.75
Remove "regularization" block 0.63 0.76 0.68 0.73
Remove L2 kernel regularizer 0.62 0.74 0.64 0.73
Remove Gaussian Noise 0.65 0.73 0.66 0.74
Reduce kernel size to (2,2) 0.63 0.74 0.66 0.74
Dropout levels reduced to 0.2 0.57 0.78 0.65 0.74
Remove BatchNormalization 0.74 0.70 0.69 0.72
Include relu activation in regularization block 0.63 0.74 0.63 0.74
Batch size = 32 0.62 0.75 0.65 0.74
Data augmentation with rotation range = 20 0.69 0.72 0.67 0.74
Data augmentation with zoom range = 0.2 0.71 0.71 0.69 0.73
Vertical flip = True 0.74 0.71 0.70 0.74
Only 1 convolutional layer per block 0.84 0.66 0.78 0.70


Model Comparison¶

Below are the accuracy and loss scores for each of our models, first in a tabular format, then represented visually in the form of bar charts.

Parameters Train Loss Train Accuracy Val Loss Val Accuracy Test Loss Test Accuracy
Model 1.1: Baseline Grayscale 605,060 0.68 0.72 0.78 0.68 0.82 0.65
Model 1.2: Baseline RGB 605,572 0.68 0.72 0.78 0.68 0.80 0.63
Model 2.1: 2nd Gen Grayscale 455,780 0.54 0.78 0.74 0.71 0.81 0.69
Model 2.2: 2nd Gen RGB 457,828 0.59 0.76 0.72 0.71 0.70 0.68
Model 3: VGG16 14,714,688 0.71 0.72 0.80 0.67 0.74 0.66
Model 4: ResNet V2 42,658,176 1.43 0.26 1.35 0.36 1.40 0.28
Model 5: EfficientNet 8,769,374 1.39 0.26 1.37 0.24 1.40 0.25
Model 6: Milestone 1 2,119,172 0.54 0.79 0.60 0.76 0.56 0.76
Model 7: Final Model 1,808,708 0.63 0.75 0.64 0.75 0.62 0.75
In [97]:
# creating a dictionary containing model accuracies
dict_model_acc = {
    "Model": ["1.1", "1.2", "2.1", "2.2", "3", "4", "5", "6", "7"],
    "Train": [0.72, 0.72, 0.78, 0.76, 0.72, 0.26, 0.26, 0.79, 0.75],
    "Validate": [0.68, 0.68, 0.71, 0.71, 0.67, 0.36, 0.24, 0.76, 0.75],
    "Test": [0.65, 0.63, 0.69, 0.68, 0.66, 0.28, 0.25, 0.76, 0.75]}

# converting dictionary to dataframe
df_model_acc = pd.DataFrame.from_dict(dict_model_acc)


# plotting accuracy scores for all models
df_model_acc.groupby("Model", sort=False).mean().plot(kind='bar', figsize=(10,5), 
                            title="Accuracy Scores Across Models", 
                            ylabel="Accuracy Score", xlabel="Models", rot=0, fontsize=12, width=0.9, colormap="Pastel2", 
                            edgecolor='black')
plt.legend(loc=(.59, 0.77))
plt.show()


# creating a dictionary containing model loss
dict_model_loss = {
    "Model": ["1.1", "1.2", "2.1", "2.2", "3", "4", "5", "6", "7"],
    "Train": [0.68, 0.68, 0.54, 0.59, 0.71, 1.43, 1.39, 0.54, 0.63],
    "Validate": [0.78, 0.78, 0.74, 0.72, 0.80, 1.35, 1.37, 0.60, 0.64],
    "Test": [0.82, 0.80, 0.81, 0.70, 0.74, 1.40, 1.40, 0.56, 0.62]}

# converting dictionary to dataframe
df_model_loss = pd.DataFrame.from_dict(dict_model_loss)

# plotting loss scores for all models
df_model_loss.groupby("Model", sort=False).mean().plot(kind='bar', figsize=(10,5), 
                            title="Loss Scores Across Models", 
                            ylabel="Loss Score", xlabel="Models", rot=0, fontsize=12, width=0.9, colormap="Pastel2", 
                            edgecolor='black')
plt.show()

Observations and Insights:

The above graphs perfectly depict the overfitting that occurs in Models 1.1, 1.2, 2.1, 2.2, and 3, with accuracy scores declining in steps as we move from training, to validation, and on to test data. The opposite is true for the loss scores. The graphs also show the total dysfunction of Models 4 and 5, with very low accuracy and very high error scores. It is also clear from the graphs that Models 6 and 7 are the most consistent, most generalizable models, and that a final decision regarding a deployable model should be made between those two options.

In deciding between Models 6 and 7, it is useful to revisit the accuracy and loss curves for the two models.

Accuracy and loss curves for Model 6:¶

Drawing Drawing

Accuracy and loss curves for Model 7:¶

Drawing Drawing

While the accuracy and loss curves for the two models both stabilize by epoch 20-25, there is no gap between accuracy and loss curves for Model 7, while a slight gap does exist for Model 6. The accuracy and loss scores are all individually better for Model 6 (higher accuracy and lower loss), but when viewed together, the spread within the two scores is larger for Model 6, while it is virtually nonexistent in Model 7. It is difficult to justify deploying a slightly overfitting model when a slightly less accurate but more generalizable model is available. Model 7 will be our final model.

Plotting the Confusion Matrix for Model 7¶

In [98]:
test_set = datagen_test_grayscale.flow_from_directory(dir_test,
                                              target_size = (img_size, img_size),
                                              color_mode = "grayscale",
                                              batch_size = 128,
                                              class_mode = 'categorical',
                                              classes = ['happy', 'sad', 'neutral', 'surprise'],
                                              seed = 42,
                                              shuffle = False)

test_images, test_labels = next(test_set)

pred = model_7.predict(test_images)
pred = np.argmax(pred, axis = 1) 
y_true = np.argmax(test_labels, axis = 1)

# Printing the classification report
print(classification_report(y_true, pred))

# Plotting the heatmap using the confusion matrix
cm = confusion_matrix(y_true, pred)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True,  fmt = '.0f', xticklabels = ['happy', 'sad', 'neutral', 'surprise'], yticklabels = ['happy', 'sad', 'neutral', 'surprise'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
Found 128 images belonging to 4 classes.
4/4 [==============================] - 0s 16ms/step
              precision    recall  f1-score   support

           0       0.79      0.84      0.82        32
           1       0.67      0.62      0.65        32
           2       0.62      0.72      0.67        32
           3       0.96      0.81      0.88        32

    accuracy                           0.75       128
   macro avg       0.76      0.75      0.75       128
weighted avg       0.76      0.75      0.75       128

Observations and Insights:

  • As noted above, our final model achieves an accuracy score of 0.75 on the test images. The model correctly predicted 96 of 128 images.
  • The choice to prioritize precision (TP/(TP+FP)) or recall (TP/(TP+FN)) depends entirely on the model's end use. If the stakes are high, and false negatives should be avoided at all costs, than recall is more important. If reducing the number of false positives is more important, than precision is the better choice. In the case of our model, no trade-off is necessary, with precision and recall scores essentially the same (precision = 0.76, recall = 0.75, F1 = 0.75).
  • As previewed during the data visualization phase of the project, the 'happy' and 'surprise' images seemed to have the most unique characteristics, and this hypothesis appears to have played out in the classification report and confusion matrix. Happy and surprise have the highest precision and recall scores (and consequently, F1 scores) of the 4 classes.
  • Additionally, 'sad' and 'neutral' images were in fact more likely to be confused with one another, as discussed during the data visualization phase. When the model misclassified a sad image, it was most likely to be mistaken for a neutral image, and vice versa.
  • Any concern about a slightly skewed class distribution can be put to rest. As previewed, the surprise images, which were outnumbered in the training and validation data, were unique enough to identify correctly despite representing a smaller proportion of training images. It is possible that our earlier finding re: elevated average pixel values for surprise images has played a role, along with the unique characteristics of surprise images, including open mouths and wide open eyes.
  • As discussed during the data visualization phase, now in the context of the confusion matrix, it should be pointed out once again that the term "accuracy" can be misleading. There are training, validation, and test images of smiling people that are labeled as "sad", while there are images of frowning people labeled as "happy", etc. If the model classifies a test image as "sad" even though the person is smiling, and in fact the test image is incorrectly classified as "sad", making the prediction accurate, should we really consider that as accurate? Or would the accurate prediction be when the model overrules the misclassified test image and, from a human's perspective, accurately classifies the image as "happy"? For this reason, the test scores and confusion matrix should be taken with a grain of salt.
  • Similarly, there is a test image that does not contain a face at all. As there are similar images across all four classes within the training data, a correct prediction of the empty test image would seem to be pure chance. Should an accurate prediction in this case really increase the model's perceived accuracy? Should an incorrect prediction of an empty test image really lower the model's perceived accuracy? It seems that any model that correctly predicts all 128 test images benefited from some degree of luck. Again, these final scores should be viewed with some degree of skepticism, but that skepticism would be similar across all models.

Visualizing Images: Actual Class Label vs Predicted Class Label¶

In [99]:
# Making predictions on the test data
y_pred_test = model_7.predict(test_set)

# Converting probabilities to class labels
y_pred_test_classes = np.argmax(y_pred_test, axis = 1)

# Calculating the probability of the predicted class
y_pred_test_max_probas = np.max(y_pred_test, axis = 1)

classes = ['happy', 'sad', 'neutral', 'surprise']

rows = 3

cols = 4

fig = plt.figure(figsize = (12, 12))

for i in range(cols):

    for j in range(rows):
        random_index = np.random.randint(0, len(test_labels))       # generating random integer

        ax = fig.add_subplot(rows, cols, i * rows + j + 1)

        ax.imshow(test_images[random_index, :])                     # selecting random test image

        pred_label = classes[y_pred_test_classes[random_index]]     # predicted label of selected image

        pred_proba = y_pred_test_max_probas[random_index]           # probability associated with model's prediction

        true_label = test_labels[random_index]                      # actual class label of selected image

        if true_label[0] == 1:                                      # converting array to class labels
            true_label = "happy"
        elif true_label[1] == 1:
            true_label = "sad"
        elif true_label[2] == 1:
            true_label = "neutral"
        else:
            true_label = "surprise"
        
        ax.set_title("actual: {}\npredicted: {}\nprobability: {:.3}\n".format(
               true_label, pred_label, pred_proba))
                
plt.gray()  
plt.show()
1/1 [==============================] - 0s 47ms/step

Observations and Insights:

  • As predicted during the data visualization phase of the project, and confirmed via the confusion matrix, we again see that "sad" and "neutral" images are the most likely to be confused for one another. All three of the errors made by our model on the 12 random images above were of the sad/neutral variety. While the model was quite confident in one case (0.90), it was not at all confident in the other two incorrect predictions (0.41 and 0.53).
  • As theorized earlier, the predictions above also seem to point to the characteristic "surprise" face (wide open mouth and eyes) as being relatively easy for the model to learn, despite the training data being somewhat imbalanced against "surprise" images. Images that are clearly "surprise" to the human eye are all correctly predicted by the model with very high confidence (0.99+).
  • The model's accuracy on the 12 random images above (9 correct out of 12) is consistent with its training, validation, and test accuracy: 0.75.

Conclusion:¶

Over the course of this project, we have thoroughly explored the ins and outs of the given data through visualization and analysis, developed 9 different convolutional neural networks, and drawn many insights from our observations along the way. Though much has already been discussed as we have gone along, a summary of the problem, our findings, and recommendations for implementation can be found below.

Problem and Solution Summary¶

As noted at the outset of this project, someone's facial expression can be a powerful window into their true feelings, and as such, can be used as a highly-effective proxy for sentiment. Emotion AI (affective computing) attempts to leverage this proxy by detecting and processing facial expression, through neural networks, in an effort to successfully interpret human emotion and respond appropriately. Developing models that can accurately detect facial emotion is therefore an important driver of advancement in the realm of artificial intelligence and emotionally intelligent machines.

The objective of this project was to utilize deep learning techniques to create a computer vision model that can accurately detect and interpret facial emotions. This model should be capable of performing multi-class classification on images containing one of four facial expressions: happy, sad, neutral, and surprise. As discussed earlier, convolutional neural networks are currently the most effective algorithmic tool available for processing images, so our solution takes the form of a CNN.

Over the course of this project, 9 CNNs were developed (with colormode variations RGB and grayscale). Before model development, the data was visually analyzed and then augmented based on that analysis, the specifics of which depended on the individual model being developed. Models ranged from simple, baseline models to much more complex architectures, including transfer learning models. Ultimately, our final model was chosen for its relatively high accuracy (compared to the other models) and, more importantly, because it is highly generalizable. A tabular and graphical summary of model performance is below.

Parameters Train Loss Train Accuracy Val Loss Val Accuracy Test Loss Test Accuracy
Model 1.1: Baseline Grayscale 605,060 0.68 0.72 0.78 0.68 0.82 0.65
Model 1.2: Baseline RGB 605,572 0.68 0.72 0.78 0.68 0.80 0.63
Model 2.1: 2nd Gen Grayscale 455,780 0.54 0.78 0.74 0.71 0.81 0.69
Model 2.2: 2nd Gen RGB 457,828 0.59 0.76 0.72 0.71 0.70 0.68
Model 3: VGG16 14,714,688 0.71 0.72 0.80 0.67 0.74 0.66
Model 4: ResNet V2 42,658,176 1.43 0.26 1.35 0.36 1.40 0.28
Model 5: EfficientNet 8,769,374 1.39 0.26 1.37 0.24 1.40 0.25
Model 6: Milestone 1 2,119,172 0.54 0.79 0.60 0.76 0.56 0.76
Model 7: Final Model 1,808,708 0.63 0.75 0.64 0.75 0.62 0.75


Drawing

The architecture for our final model (Model 7) is more complex than our baseline models, but not nearly as complex as the VGG16, ResNet, or EfficientNet transfer learning models that were developed. Model 7 consists of three fairly standard convolutional blocks with relu activation, BatchNormalization, MaxPooling, and Dropout layers. The critical block that conquered overfitting and removed the generalization gap was a regularization block consisting of BatchNormalization layers and a convolutional layer with L2 regularization. Two additional key features of Model 7 are heavy usage of BatchNormalization throughout the architecture, as well as the addition of GaussianNoise between the two fully-connected layers.

The combination of the above features delivered a model with training, validation, and test accuracy of 0.75. While 75% accuracy may not seem particularly high, correctly classifying the FER2013 dataset, from which it appears our data was drawn, is extremely challenging, with human-level accuracy standing at just ±65%. So Model 7 may be more accurate at classifying our dataset than a human, but whether or not 75% accuracy is ultimately high enough for deployment depends entirely on the business use and the cost that would be incurred in any efforts made to improve model performance.

For example, if this computer vision model is being developed to create photo filters for a phone application, perhaps an accuracy of 0.75 is sufficient. It is better than random guessing (0.25) and also better than a human being (0.65). As the stakes in this instance are pretty low, 75% accuracy would likely suffice for model deployment, particularly if 75% accuracy is higher than that of similar phone applications on the market. If, on the other hand, this computer vision model is being developed for use in some sort of life or death medical situation, 75% accuracy may be too low, and improvement might justify the additional expenses incurred.

Recommendations for Implementation¶

The spectrum of possible use cases for Emotion AI in general, and facial emotion recognition technology in particular, is so broad that it is difficult to give a general set of recommendations for implementation. It very much comes down to the specific use case for each business, organization, or government.

The first big question to answer is the following: will the collection of this private, emotional data require consent (or opting in) from the individual whose facial emotions are being recorded and analyzed? If consent is required and granted, that makes it easier from a business perspective, as long as the consent given by the individual was based on truthful, transparent terms, and the business lives up to its end of the agreement. If, however, a computer vision model will be used to extract data from individuals without their consent or knowledge, that puts a business, organization, or government in a much more vulnerable position, with huge potential for a privacy rights related backlash and consequent loss of reputation, brand loyalty, market share, legitimacy, etc.

Model 7, with an accuracy of 75 percent, should be considered deployable in some circumstances under certain conditions, and should absolutely not be considered deployable in other circumstances. For example, if a company is analyzing someone's facial reaction to an advertisement (with their permission) in an effort to better target future advertisement campaigns or decide what customer demographic should receive a particular coupon in the mail, than 75 percent accuracy (again, with permission) is perfectly reasonable. If, on the other hand, the intention is to deploy this computer vision model in a situation that can materially impact someone's life in a serious way (denying a loan, denying a job, deciding guilt or innocence in a court of law, student performance in school, etc.), than 75 percent accuracy is nowhere near what it would need to be. On top of that, we should be giving serious thought as to whether or not even the most accurate computer vision model should be deployed in those situations anyways.

For the sake of this exercise, let us assume that a business is interested in our computer vision model to better understand how their advertising campaigns are perceived by current and potential customers. Some key recommendations would be:

  • Determine whether or not a computer vision model is even needed in this circumstance, or if there is a less intrusive way to obtain the desired data that may be equally effective.
  • Revisit the training dataset and determine whether or not it is a representative enough sample for our purposes. If it is not representative enough, or is simply too small, retrain the model on the appropriate data rather than deploy a model with known biases.
  • Gain informed consent grounded in transparency.
  • Ensure that the model will only be used in the context that was agreed to by the individual, and that it will not be sold to a third party or used in anyway that could negatively impact the individual.

Assuming the above to be true, stakeholder actionables could include:

  • Ensure ethical standards are being monitored and maintained in terms of data privacy.
  • Monitor the return on advertisement spending to determine if our computer vision model is having an impact on the bottom line.
  • Consider other use cases for computer vision models, like in store cameras that can assist in identifying what customers might be interested in purchasing and alerting a salesperson who can then act on that information. This obviously moves away from informed consent, so that would need to be taken into consideration. Would signs indicating that cameras were recording be sufficient? Is it worth a potential privacy-backlash or lawsuit if data is mishandled or used in new ways down the road?

Associated costs include:

  • Depending on the size of the company, anywhere from 2-12% of sales revenue is generally allocated to marketing. A larger company would likely not have much difficulty absorbing the cost of deploying a computer vision model if management believed the investment would pay for itself (and then some) with increased sales.
  • A smaller company, with a correspondingly smaller marketing budget, might have a more difficult time funding the deployment of a computer vision model, however, in many ways, it has more to gain than the larger company. Smaller businesses need any advantage they can get, and in this case, a bit of investment up front could potentially save them a large chunk of a marketing budget wasted advertising the wrong products to the wrong (potential) customers.
  • A serious cost to consider when deploying a computer vision model like this is what happens in the case of a data leak, or if the data is mishandled in some way resulting in a lawsuit or penalties of some kind. For small and large businesses alike, a mistake like that could be fatal.

The upside to deploying Emotion AI technology like our computer vision model is huge:

  • Insights generated by Emotion AI data could help a business improve their advertising/marketing strategies, learn which products or messaging most resonates with existing and potential customers, and use this knowledge to better target their marketing campaigns.
  • Improved marketing can lead to increased customer satisfaction and long-term brand loyalty. A business that can provide a customer with the right product at the right time in a hyper-personalized manner is more likely to keep that customer's business in the future.

Key risks and challenges include several issues already discussed:

  • The first risk involves the principle assumption upon which the entire facial emotion detection/recognition model is built: that smiling people are happy, frowning people are sad, etc. This is obviously not always the case. Expressions vary from person to person, from culture to culture, and even for the same person from one moment to the next. People can be smiling yet be feeling no emotion. People can be happy or sad yet have a neutral face. Someone could be feeling a compound emotion, like being happily surprised or sadly surprised. Our model may have an accuracy of 75%, but that is only with respect to preexisting class labels. Our model may identify a potential customer viewing our marketing as smiling, but it does not necessarily follow that the person is happy or enjoying the advertisement. In essence, the model may work, but the science behind the assumptions may not.
  • Improper handling of deeply personal data could be catastrophic for a business of any size, but particularly a small business. Ethical guidelines pertaining to the collection and use of this data must be put into place and monitored to ensure data privacy is maintained above all else.
  • Emotion AI, and AI in general, is developing so quickly that governments and regulators are unable to keep pace. It is possible that in the short- to medium-term, once regulators are able to better assess the situation and gain consensus on any legislation, Emotion AI technology may be subject to regulations, including computer vision models such as the one we have developed. If this occurs before our model is deployed and insights are gained, it would be a wasted investment.
  • If there is a massive data breach in the future related to Emotion AI, whether it be at a large company/organization or a government agency, it could generate such bad publicity and public sentiment that any business utilizing or at all associated with the technology could face a data privacy related backlash.

Potential further action:

  • If the company is large enough to have government affairs personnel, the company should build and maintain relationships with government agencies and officials that are likely to be responsible for future legislation that may impact Emotion AI and its use by businesses and organizations.
  • Regarding the computer vision model we have developed, it is important to know where the training images came from. How were they sourced? Did the subjects in the photos grant permission for their images to be used in this way? It seems unlikely to be the case, as many training images have watermarks and appear to be scraped from the Internet. It would be more ethical to procure or develop a set of training images that are in line with the ethical guidelines developed by the company. This would obviously entail revising the model based on new training data, but safeguarding the privacy of the individuals in the photos should be paramount. A positive impact on the bottom line is important, but before all else, do no harm. Emotion AI is likely to be a controversial topic in the years ahead, and it makes better business sense to see what is coming down the road and act now in a way that protects everyone involved.
In [ ]: